Related to Ed's email below, here is a primer on differential privacy that I found helpful and that you might too.
Here is the abstract:
In early 2021, the US Census Bureau will begin releasing statistical tables based on the decennial census conducted in 2020. Because of significant changes in the data landscape, the Census Bureau is changing its approach to disclosure avoidance. The confidentiality of individuals represented “anonymously” in these statistical tables will be protected by a “formal privacy” technique that allows the Bureau to mathematically assess the risk of revealing information about individuals in the released statistical tables. The Bureau’s approach is an implementation of “differential privacy,” and it gives a rigorously demonstrated guaranteed level of privacy protection that traditional methods of disclosure avoidance do not. Given the importance of the Census Bureau’s statistical tables to democracy, resource allocation, justice, and research, confusion about what differential privacy is and how it might alter or eliminate data products has rippled through the community of its data users, namely: demographers, statisticians, and census advocates.
The purpose of this primer is to provide context to the Census Bureau’s decision to use a technique based on differential privacy and to help data users and other census advocates who are struggling to understand what this mathematical tool is, why it matters, and how it will affect the Bureau’s data products.