People and homes are disappearing on paper because of a new privacy method in the 2020 census

The colonial-style, three-bedroom house in Milwaukee where Jessica Stephenson has lived for the past six years bustles with activity every day of the week, filled with the chatter of children in the nursery she runs out of her home.

The U.S. Census Bureau says no one lives there.

“They should come and see for themselves,” Stephenson said.

From her predominantly black neighborhood in Wisconsin to a Hasidic Jewish community in the Catskill Mountains of New York to a park outside Tampa, Florida, a method the Census Bureau used for the first time in the 2020 census to protect confidentiality has made people and occupied homes disappear, at least on paper, even though they exist in the real world.

This is not a magic trick but a new statistical method used by the bureau, called differential privacy, which involves intentionally adding errors to the data to obscure the identity of any individual participant.

Bureau officials say privacy needs to be protected at a time of increasingly sophisticated data mining, as technological innovations heighten the threat that people could be “re-identified” by someone using powerful computers to match census data with other public databases. By law, census answers are supposed to be confidential.

But some city officials and demographers believe the method strays too far from reality and could introduce errors into the data used to draw political districts and distribute federal funds.

At least one analysis suggests that differential privacy could penalize minority communities by undercounting racially and ethnically mixed areas. Researchers at Harvard University found that the method made it harder to create political districts of equal population and could result in fewer majority-minority districts.

The Census Bureau, however, maintains that the data are as good as in previous censuses and that the small-scale inaccuracies are not a major problem.

What is certain is that the method can produce strange, contradictory and erroneous results at the smallest geographic levels, such as neighborhood blocks.

For example, official 2020 census results say 54 people live in Stephenson’s census block in downtown Milwaukee, but also that it has no inhabited homes. In fact, nearly two dozen houses, some more than a century old, stand along its car-lined streets. Forty-eight of the block’s residents are black, according to the census, although given the vagaries of differential privacy, it’s hard to know for sure.

In another case, the census lists no people living in Flatwoods Conservation Park outside Tampa, although it says the park contains an occupied home. According to Hillsborough County spokesman Todd Pratt, two county employees live there while taking care of park security.

And in a Hasidic Jewish enclave in Kiamesha Lake, New York, 81 people are recorded as residents, but the census officially says there are no inhabited homes. Sullivan County property records show nearly a dozen homes whose residents are linked to the Vizhnitzer Hasidic community.

The unreliable data has caused headaches for city managers and planners in small communities, who worry that the numbers may not be sound enough for decision-making. Eric Guthrie, a senior demographer at the Minnesota State Demographic Center, said he was contacted by half a dozen city managers from across the state who were concerned about possible impacts on state and federal funding.

“I explain to them that there is no method to fix it, that this is not a mistake in the traditional sense,” Guthrie said. “The bug is there by design.”

The scale of the changes becomes clearer when viewed through a wider lens. In Florida, the third most populous state in the nation with more than 21 million residents, the 2020 census listed 15,000 neighborhood blocks with a total of 200,000 residents but no inhabited homes. Conversely, 1,200 of the state’s 484,000 blocks were listed as having inhabited homes but no people, said Rich Doty, geographic information system coordinator and research demographer at the University of Florida’s Bureau of Economic and Business Research.

“We were expecting these anomalies because we had been alerted to them by the Census Bureau and other states,” Doty said. “We didn’t expect them to be so widespread.”

Before the August release of the census data used to draw congressional and legislative districts, Acting Census Bureau Director Ron Jarmin warned that the method could create some “soft” numbers at the neighborhood block level, and urged data users to combine blocks to get accurate results. But the bureau also says that, despite the implementation of differential privacy, its data quality measures show the 2020 data is no worse than that of previous censuses.

That claim is difficult to assess because the raw data, without differential privacy applied, has not been released, said Stefan Rayer, a demographer at the University of Florida.

“We have to take their word for it,” Rayer said.

Using test data, Harvard researchers found that differential privacy is more likely to undercount racially mixed and politically mixed districts, “which brings unpredictable racial and party biases,” because it favors an accurate population count for the largest racial group in a given area.

“Our findings highlight difficulties in balancing the accuracy and privacy of respondents in the census,” the report said.

The Census Bureau disagrees, and the courts have so far found no reason to halt the method.

Earlier this year, the state of Alabama unsuccessfully challenged differential privacy in court. In a declaration filed in the lawsuit, the Census Bureau’s chief scientist, John Abowd, described the data as “extremely accurate” and said the use of differential privacy did not show bias against racial or ethnic minorities.

“Redistricters can remain confident in the accuracy of the population counts and the demographic characteristics of the districts they draw, despite the noise in the individual blocks,” Abowd said.

Not everyone believes the technique is the right way to protect confidentiality.

Two researchers from the University of Minnesota wrote in a recent post that a Census Bureau experiment did not demonstrate a real threat to confidentiality and that any risk of re-identification was similar to randomly guessing household characteristics.

One of them, demographer Steven Ruggles, said during a presentation this month that the Census Bureau’s fear of re-identification and the consequent justification for the use of differential privacy could undermine confidence in census data.

“This should not justify the degradation of our country’s statistical infrastructure,” Ruggles said. “The whole thing is likely to backfire.”
