Today’s (7/28/2021) blog post by acting Census Bureau director Dr. Ron Jarmin is essential
reading, and the embedded YouTube videos help explain what’s going on.
Here’s a link to the blog post:
The What is Redistricting video:
The Protecting Privacy video:
Here’s a snippet from the Director’s post:
"With these [privacy protecting] parameters, some small areas like census blocks may
look “fuzzy,” meaning that the data for a particular block may not seem correct.
Importantly, our approach yields high quality data as users combine these “fuzzy”
blocks to form more significant geographic units like census tracts, cities, voting
districts, counties, and American Indian/Alaska Native tribal areas. Our calibration was
designed to achieve acceptable quality thresholds for these levels of geography.
So, if you’re looking at block-level data, you may notice situations like the following:
- Occupancy status doesn’t match population counts. Some blocks may show that the housing
  units are all occupied, but the population count is zero. Other blocks may show the
  reverse: the housing units are vacant, but the population count is greater than zero.
- Children appear to live alone. Some blocks may show a population count for people under
  age 18 but show no people age 18 and older.
- Households appear unusually large. For example, you may find blocks with 45 people, but
  only three housing units.
Though unusual, situations like these in the data help confirm that confidentiality is
[being protected].
Noise in the block-level data will require a shift in how some data users typically
approach using these census data.
Instead of looking for precision in an individual block, we strongly encourage data users
to aggregate, or group, blocks together. As blocks are grouped together, the fuzziness
disappears. And when you step back with more blocks in view, the details add together and
make a sharp picture."
# # #
So, as I understand it, the Bureau is not taking complete microdata records (the
household, household members, etc.) and sprinkling them randomly within a census tract.
Rather, each individual variable is independently (?) modeled/simulated using the Bureau’s
privacy protection parameters. More or less, I guess.
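To convince myself of the "fuzziness disappears as you aggregate" claim, here is a toy sketch in Python. It is my own illustration, not the Bureau's actual algorithm: the block counts are made up, the noise is a rounded Gaussian with a hypothetical `sigma`, noise is added to blocks only, and a "tract" is just 50 consecutive blocks. The function names and parameters are all assumptions for the sake of the example.

```python
import random


def group_sums(values, size):
    """Sum consecutive runs of `size` blocks into hypothetical 'tracts'."""
    return [sum(values[i:i + size]) for i in range(0, len(values), size)]


def mean_relative_error(true_vals, noisy_vals):
    """Average |noisy - true| / true over areas with a nonzero true count."""
    errors = [abs(n - t) / t for t, n in zip(true_vals, noisy_vals) if t > 0]
    return sum(errors) / len(errors)


def simulate(num_blocks=1000, blocks_per_tract=50, sigma=5.0, seed=0):
    """Return (block-level, tract-level) mean relative error for toy data."""
    rng = random.Random(seed)
    # Made-up "true" block populations between 0 and 100.
    true_blocks = [rng.randint(0, 100) for _ in range(num_blocks)]
    # Assumption: independent, zero-mean integer noise on every block.
    # Negative results are left as-is here; published counts would not be negative.
    noisy_blocks = [t + round(rng.gauss(0, sigma)) for t in true_blocks]

    block_err = mean_relative_error(true_blocks, noisy_blocks)
    tract_err = mean_relative_error(
        group_sums(true_blocks, blocks_per_tract),
        group_sums(noisy_blocks, blocks_per_tract),
    )
    return block_err, tract_err


if __name__ == "__main__":
    block_err, tract_err = simulate()
    print(f"block-level mean relative error: {block_err:.1%}")
    print(f"tract-level mean relative error: {tract_err:.1%}")
```

The intuition: with independent zero-mean noise, the noise in a sum of blocks grows roughly with the square root of the number of blocks, while the true total grows in proportion to it, so the relative error shrinks as blocks are combined. In this toy setup a single small block can be off by a large percentage while the 50-block totals stay close to the truth.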
If I were a city planner, the variable that I would/should be most certain of is the count
of dwelling units at my city block level. I wouldn’t readily know whether those housing units
were occupied or vacant, but I could believe with my own eyes (and aerial photography) that
a housing unit is present. But if “housing units” are fuzzified (?) using differential
privacy, then, meh…
Fuzzy Wuzzy was a bear, Fuzzy Wuzzy had no hair, Fuzzy Wuzzy wasn’t fuzzy, was he?