From: American Community Survey Data Users Group <noreply(a)prb.org>
Sent: Friday, July 30, 2021 9:18 AM
To: Weinberger, Penelope <pweinberger(a)aashto.org>
Subject: Resources for 2020 ACS Data Release
Update from American Community Survey Data Users Group
The Census Bureau announced yesterday that they will not be releasing their standard 2020 ACS 1-year data products in September as planned due to the impact of the pandemic on data quality. Instead, they will be releasing a set of experimental estimates from the 1-year data. They have created several resources to help data users prepare for this change:
* A press kit <https://www.census.gov/newsroom/press-kits/2021/impact-pandemic-2020-acs-1-…> includes a PDF version of yesterday's webinar, background materials explaining what experimental data products are, and information about the Census Bureau’s statistical quality standards. A recording of the webinar should be available by late Monday.
* A revised 2020 ACS release schedule <https://www.census.gov/programs-surveys/acs/news/data-releases/2020/release…> includes the complete schedule with new planned release dates.
* The ACS Resource Hub <https://www.census.gov/programs-surveys/acs/library/flyers/resource-hub.html> flyer and 2020 ACS 1-Year Estimates: What You Need to Know <https://www.census.gov/programs-surveys/acs/library/flyers/flow-chart.html> flyer provide more information.
You were sent this email because an administrator sent it to all users in the Everyone role on American Community Survey Data Users Group.
This may be of interest to some.
-------- Forwarded Message --------
Subject: Resources for 2020 ACS Data Release
Date: 30 Jul 2021 13:17:45 +0000
From: American Community Survey Data Users Group <noreply(a)prb.org>
To: Ed Christopher <edc(a)berwyned.com>
Today’s (7/28/2021) blog post by acting Census Bureau director Dr. Ron Jarmin is essential reading. And the embedded YouTube videos help us understand what’s going on.
Here’s a link to the blog post:
https://www.census.gov/newsroom/blogs/director/2021/07/redistricting-data.h…
The What is Redistricting video:
https://youtu.be/O0MhAue2Tuk
The Protecting Privacy video:
https://www.youtube.com/watch?v=1AaoaBcHoss
Here’s a snippet from the Director’s post:
"With these [privacy protecting] parameters, some small areas like census blocks may look “fuzzy,” meaning that the data for a particular block may not seem correct. Importantly, our approach yields high quality data as users combine these “fuzzy” blocks to form more significant geographic units like census tracts, cities, voting districts, counties, and American Indian/Alaska Native tribal areas. Our calibration was designed to achieve acceptable quality thresholds for these levels of geography.
So, if you’re looking at block-level data, you may notice situations like the following:
Occupancy status doesn’t match population counts. Some blocks may show that the housing units are all occupied, but the population count is zero. Other blocks may show the reverse: the housing units are vacant, but the population count is greater than zero.
Children appear to live alone. Some blocks may show a population count for people under age 18 but show no people age 18 and older.
Households appear unusually large. For example, you may find blocks with 45 people, but only three housing units.
Though unusual, situations like these in the data help confirm that confidentiality is being protected.
Noise in the block-level data will require a shift in how some data users typically approach using these census data.
Instead of looking for precision in an individual block, we strongly encourage data users to aggregate, or group, blocks together. As blocks are grouped together, the fuzziness disappears. And when you step back with more blocks in view, the details add together and make a sharp picture."
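The aggregation point in the quote can be illustrated with a toy simulation in R. This is only a sketch under loud assumptions: it is not the Bureau's disclosure avoidance algorithm, the block counts are made up, the noise is a simple zero-mean Laplace-like draw applied independently to each block, and every name and parameter below is hypothetical.

```r
# Toy illustration only: NOT the Census Bureau's disclosure avoidance system.
# Assumption: each block count gets independent, zero-mean noise.
set.seed(2020)

n_blocks   <- 2000                            # hypothetical number of blocks
true_count <- rpois(n_blocks, lambda = 30)    # made-up "true" block populations

# Laplace-like noise built as the difference of two exponential draws
noise <- round(rexp(n_blocks, rate = 1 / 5) - rexp(n_blocks, rate = 1 / 5))
noisy <- pmax(true_count + noise, 0)          # published counts cannot be negative

# Group every 50 consecutive blocks into one hypothetical "tract"
tract_id    <- rep(seq_len(n_blocks / 50), each = 50)
true_tract  <- tapply(true_count, tract_id, sum)
noisy_tract <- tapply(noisy, tract_id, sum)

# Median absolute percent error at each level of geography
pct_err <- function(est, truth) {
  median(abs(est - truth) / pmax(truth, 1)) * 100
}
block_err <- pct_err(noisy, true_count)
tract_err <- pct_err(noisy_tract, true_tract)

cat(sprintf("block-level error: %.1f%%  tract-level error: %.1f%%\n",
            block_err, tract_err))
```

Because the per-block noise is independent and centered on zero, much of it cancels when blocks are summed, so the percent error for the grouped "tracts" comes out well below the percent error for single blocks.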
# # #
So, as I understand it, the Bureau is not taking complete microdata records (the household, household members, etc.) and sprinkling them randomly within a census tract. Rather, each individual variable is independently (?) modeled/simulated using their privacy protection parameters. More or less, I guess.
If I were a city planner, the variable that I would/should be most certain of is the count of dwelling units at the city block level. I wouldn’t readily know if those housing units were occupied or vacant, but I could believe with my own eyes (and aerial photography) that a housing unit is present. But if “housing units” are fuzzified (?) using differential privacy, then, meh…
###
Fuzzy Wuzzy was a bear, Fuzzy Wuzzy had no hair, Fuzzy Wuzzy wasn’t fuzzy, was he?
###
Hello all:
The Census 2020 “PL 94-171” file — the “redistricting file” — the “short form” data from the decennial census — will be released by August 16, 2021 in a “legacy format”. This means that it will be released in large data chunks, downloadable from the Census Bureau’s website, but not accessible using the Census API (Application Programming Interface) until (perhaps) September 30, 2021.
This means that the PL 94-171 data will not be available (immediately) via data.census.gov or R packages such as TIDYCENSUS.
Census Bureau’s main page on Census 2020 PL 94-171:
https://www.census.gov/programs-surveys/decennial-census/about/rdo/summary-…
Census Bureau’s video on the PL 94-171 data release. Watch It!
https://www.youtube.com/watch?v=O0MhAue2Tuk&t=86s
But there are two R packages available to read in the “legacy format” PL 94-171 files:
CENSUSAPI https://CRAN.R-project.org/package=censusapi
and
PL94171 https://CRAN.R-project.org/package=PL94171
And they work!
I really like PL94171, and will be using it in my analyses between August 16th and whenever the “API ready” data are available.
Attached are my scripts that test the CENSUSAPI, TIDYCENSUS, and PL94171 packages for the Census 2010 (Rhode Island, California) and the 2018 Test Census (Providence County, RI).
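For readers without the attachments, here is a minimal sketch of reading legacy-format files with PL94171. It is not the attached script. It assumes the function names pl_read(), pl_subset(), and pl_select_standard(), and the bundled example path "extdata/ri2018_2020Style.pl", as I understand them from the package documentation; please check them against your installed version. The code skips itself when the package or the example files are not present.

```r
# Sketch only: runs the PL94171 steps if the package is installed,
# otherwise leaves `pl` as NULL.
pl <- NULL
if (requireNamespace("PL94171", quietly = TRUE)) {
  # Assumed location of the example files shipped with the package
  # (2018 test census, Providence County, RI).
  path <- system.file("extdata/ri2018_2020Style.pl", package = "PL94171")
  if (nzchar(path)) {
    pl_raw <- PL94171::pl_read(path)                      # read the legacy-format files
    pl     <- PL94171::pl_subset(pl_raw, sumlev = "140")  # keep census tracts (summary level 140)
    pl     <- PL94171::pl_select_standard(pl, clean_names = TRUE)  # standard subset of columns
    print(head(pl))
  }
}
```

For the real 2020 release, the same steps would presumably point `path` at the downloaded state folder instead of the bundled example.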
# # #
And here is the TWITTER stream that alerted me to the presence of the PL94171 package:
from the twitterverse:
Kyle Walker <https://twitter.com/kyle_e_walker>
@kyle_e_walker <https://twitter.com/kyle_e_walker> (July 12, 2021)
tidycensus #rstats <https://twitter.com/search?q=%23rstats> users: this means that the soonest we'll have 2020 redistricting data in the package is early October, though you can work with the raw data yourselves in mid-August by downloading from the FTP site.
Hansi Lo Wang @hansilowang <https://twitter.com/hansilowang> (July 12, 2021)
PRO TIP: If you're confused by the Census Bureau citing two different expected release dates for 2020 census redistricting data — Aug. 16 and Sept. 30 — bureau official Nicholas Jones says in this video: "It's the same data, just in different formats"
youtube.com/watch?v=O0MhAu… <https://t.co/C8yYHFsot1>
Christopher Kenny @Chris_T_Kenny <https://twitter.com/Chris_T_Kenny>
Jul 12 <https://twitter.com/Chris_T_Kenny/status/1414639454548119560>
Replying to @kyle_e_walker
In that interim, @CoryMcCartan <https://twitter.com/CoryMcCartan/> and I have the PL94171 package on CRAN CRAN.R-project.org/package=PL94171 <https://t.co/nnRbEcRJpc>. It has tools to download, read, and process the PL files once they're available from the FTP. (I look forward to once they're tidycensus readable!)
Kyle Walker @kyle_e_walker <https://twitter.com/kyle_e_walker>
Jul 12 <https://twitter.com/kyle_e_walker/status/1414648437447053317>
Replying to @Chris_T_Kenny @CoryMcCartan
excellent, great work!
Chuck Purvis @charleypurvis <https://twitter.com/charleypurvis>
(July 19, 2021)
Replying to @kyle_e_walker
The good news is that TIDYCENSUS can currently read the Census 2010 PL 94-171 data. I was sincerely hoping the Census Bureau had the 2020 PL data “API ready” by August 16, but alas, this doesn’t appear to be the case!
# # #
Hopefully folks can use and improve on my R scripts. I’m still an old-dog-learning-new-tricks with R, so recommendations to improve them would be welcome.
And if you’re really ambitious, you can check out the REDIST package (geographic methods for re-districting!).
cheers,
Chuck Purvis,
Hayward, California
(formerly of the Metropolitan Transportation Commission, San Francisco, California)
# # #
Attachments:
From: American Community Survey Data Users Group <noreply(a)prb.org>
Sent: Tuesday, July 20, 2021 5:47 PM
To: Weinberger, Penelope <pweinberger(a)aashto.org>
Subject: *Updated Link and Date* Census Bureau to Host Webinar on Impact of Pandemic on the American Community Survey
Update from American Community Survey Data Users Group
*Updated Link and Date* Census Bureau to Host Webinar on Impact of Pandemic on the American Community Survey
The U.S. Census Bureau has scheduled a webinar on Thursday, July 29, at 2 p.m. EDT to explain and answer questions related to the impact of the coronavirus pandemic on 2020 ACS statistics and data products.
Speakers:
Michael C. Cook, Sr., chief, Public Information Office, U.S. Census Bureau
Donna M. Daily, chief, American Community Survey Office, U.S. Census Bureau
Login information:
CENSUS WebEx Enterprise Site <https://uscensus.webex.com/mw3300/mywebex/default.do?nomenu=true&siteurl=us…>
Audio conference access information:
Toll-free number: 888-566-5775
Participant passcode: 6714070
Please log in 10-15 minutes early, as some setup is required. An updated browser is recommended. Credentialed media will be able to ask questions following the presentation.
Learn More <https://www.census.gov/newsroom/press-releases/2021/webinar-impact-pandemic…>
Hello to the CTPP listserv. Thought I would share something I started several months ago. I somehow got distracted - - baseball or beer or squirrels or something!
My ex-colleague mentioned a year or so ago the R-package “CTPPr” produced by Westat staff. Thought I would give it a test drive.
It works great for its intended use: table-specific downloads of Part 1, 2, 3 data for entire states (or multiple states).
Ideally I would flesh out these examples, using the census mapping options in R, as well as combining it with other census R-packages such as TIGRIS and TIDYCENSUS.
Here’s my “getting started” R script for extracting CTPP 2006/10 and 2012/16 data using the CTPPr package.
Cheers, Chuck Purvis, Hayward, California (formerly with the Metropolitan Transportation Commission, San Francisco, California).
############################################################
# CTPPr_GettingStarted_1.r
# Getting started with the Westat R Package "CTPPr"
# CTPPr is authored by: Anthony Fucci, Alexander Cates
# and Marcelo Simas of Westat
# Note that there isn't an option to select only certain
# tracts, places, TADs, TAZs, WITHIN a state... only the
# ENTIRE state! User may prefer to use the Beyond2020
# software for extracting only some parts within a state!
#
# Examples Developed by Chuck Purvis, Hayward, California
# -- March 21-25, 2021 --
###########################################################
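# Install CTPPr onto local computer.. Just need to do once!
# The requireNamespace() checks skip the install when already present.
# I'm just not sure about etiquette for updating packages!
if (!requireNamespace('devtools', quietly = TRUE)) install.packages('devtools')
if (!requireNamespace('CTPPr', quietly = TRUE))
  devtools::install_github('Westat-Transportation/CTPPr')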
# Activate/load the CTPPr library for this script.
library("CTPPr")
# optional libraries to load, depending on how I expand the examples!
library('dplyr')
library("tidyverse")
library("magrittr")
# Produce a VIEWER table of ALL CTPP Tables
# It defaults to the CTPP 2012-16 data. Not sure about 2006-10....
ctpp_tables()
# Demonstrate various examples of the download_ctpp function....
# without geography= or state=, the function defaults to US States,
# 50 US States + District of Columbia + Puerto Rico!
# Package authors strongly recommend using BOTH geography= and
# state= or you will crash the AASHTO servers or worse.
# The default dataset is the 2012-2016 CTPP.
# CTPPr can also be used to pull 2006-2010 CTPP data.
# But TABLE NUMBERING can be different between the two databases,
# so be really careful!
# Note that a simple one-cell table will retrieve three variables:
# 1. Geography Variable Name (RESIDENCE, WORKPLACE, or both!)
# 2. Estimate
# 3. Standard Error (the Authors have divided the MOE by 1.645 to yield SE)
# Table A101100 is Total Population! A one-cell table!
# "A" -- based on the entire ACS microdata
# "1" -- residence-based geographies
# "01" -- data universe is "total population"
# "100" -- This is Table #100.
# These three examples pull US States.
totalpop_1216 <- download_ctpp(A101100)
totalpop_1216a <- download_ctpp(A101100,dataset='2016')
totalpop_0610 <- download_ctpp(A101100,dataset='2010')
# This example, weirdly, provides County data for California!
Califpop_1216 <- download_ctpp(A101100,dataset='2016',
geography='State',
state="California")
# Example: Counties within California
Calpop_1216a <- download_ctpp(A101100,dataset='2016',
geography='County',
state="California")
# Example: Places within California
CalPlacePop_1216a <- download_ctpp(A101100,dataset='2016',
geography='Place',
state="California")
# Example: All Census Tracts within California with the Tract Name
CalTractPop_1216a <- download_ctpp(A101100,dataset='2016',
geography='Tract',
state="California",
output="Name")
# Example: Tracts within California, Using the FIPS Code instead of Name
CalTractPop_1216b <- download_ctpp(A101100,dataset='2016',
geography='Tract',
state="California",
output = 'FIPS Code')
# Example: Tracts within California, Split FIPS Code
# probably need some follow-on examples to split the FIPS code
# into 3 separate variables: state, county, tract.
CalTractPop_1216c <- download_ctpp(A101100,dataset='2016',
geography='Tract',
state="California",
output = 'Split FIPS Code')
# Example: PUMAs within California
# And then add a new variable, Coefficient of Variation (SE/Estimate)
CalPUMAPop_1216 <- download_ctpp(A101100,dataset='2016',
geography='PUMA',
state="California",
output = 'Name')
CalPUMAPop_1216$CV <-
  CalPUMAPop_1216$SE / CalPUMAPop_1216$Estimate
# Example: PUMAs within California, FIPS Code
CalPUMAPop_1216a <- download_ctpp(A101100,dataset='2016',
geography='PUMA',
state="California",
output = 'FIPS Code')
# Example: PUMAs within California, FIPS Code and Name!!
CalPUMAPop_1216b <- download_ctpp(A101100,dataset='2016',
geography='PUMA',
state="California",
output = 'FIPS and Name')
# Example: PUMAs within California, Split FIPS Code
CalPUMAPop_1216c <- download_ctpp(A101100,dataset='2016',
geography='PUMA',
state="California",
output = 'Split FIPS Code')
# Example: Metropolitan Statistical Areas within California
# Note: CTPPr documentation doesn't mention the State>MSA sum level.
# but it works... at least for California
MSA_Pop_1216a <- download_ctpp(A101100,dataset='2016',
geography='MSA',
state="California",
output = 'Name')
# Example: Principal Cities within Each MSA in California.
# Note that MSAs may have multiple Principal Cities!!!
MSACity_Pop_1216 <- download_ctpp(A101100,dataset='2016',
geography='City',
state="California",
output = 'Name')
# For whatever reason, pulling urbanized areas isn't working.
# Wrapped in try() so an error here doesn't halt the rest of the script.
UA_Pop_1216 <- try(download_ctpp(A101100,dataset='2016',
                                 geography='UA',
                                 # state="California",
                                 output = 'Name'))
# Other residence and workplace geography include:
# OUSA (workplace tables only); TAD; and TAZ, and MCD(12 MCD states)
# Pull a two way table: Population by Hispanic (3) by Race (5)
# This yields 15 records (3*5) for each County
# Two Additional columns are included in this particular data frame:
# -- "Hispanic Origin 3"
# -- "Race of Person 5"
Calif_County_Race_Hisp_1216 <-
download_ctpp(A101204,dataset = '2016', output = "FIPS Code",
geography = 'County', state = "California")
# Pull a REALLY big two-way table: Workers by Occupation (25)
# by Industry (15)...
# This yields 21,750 observations (25 * 15 * 58) for the 58 California counties.
# Columns include: RESIDENCE, "Occupation 25", "Industry 15",
# Estimate, and SE.
Calif_County_Workers_Occ_Ind_1216 <-
download_ctpp(A102214,dataset = '2016', output = "FIPS Code",
geography = 'County', state = "California")
# The following doesn't work, even though it's listed in the view
# of available tables. Be careful!!
# I don't think "C" tables are really available ....
# Wrapped in try() so an error here doesn't halt the rest of the script.
Calif_County_Workers_Occ_Ind_1216aaa <-
  try(download_ctpp(A102214C,dataset = '2016', output = "FIPS Code",
                    geography = 'County', state = "California"))
##################################################################
# Workplace Tables! (Part 2 Tables)
# Total Workers at Work by County of Work
# Output columns in the data frame are: WORKPLACE, Estimate, and SE
# "A" -- data is from the entire ACS microdata, not just the 5% in PUMS
# "2" -- these are tabulated at the work-end (or workplace)
# "02" -- the data universe is "total workers at work"
# "100" -- this is table #100
Calif_COW_TotWorkers <-
download_ctpp(A202100,dataset = '2016', output = "Name",
geography = 'County', state = "California")
##################################################################
# Worker Flow Tables (Part 3 Tables)
# Total Worker Flows, County-to-County, Intra-State California
# Intra State California is 58 by 58 counties, for 3,364 records!
# 58 * 58 = 3364.... lots of records with zero commuters!!
# Output columns include: RESIDENCE, WORKPLACE, Estimate, SE
Calif_Co2Co_TotWork <-
download_ctpp(id="A302100",
dataset = '2016',
output = "FIPS and Name",
geography = 'County->County',
state = "California")
# Pull County-to-County workers for four Western US States
# This yields 15,876 records, but a lot of the records have 0 workers!
Calif_Neighbors_Co2Co_TotWork <-
download_ctpp(id="A302100",
dataset = '2016',
output = "FIPS and Name",
geography = 'County->County',
state = "California,Arizona,Nevada,Oregon")
# Perhaps continue the above example to just retain
# Intra-state records, and inter-state into/out of California....
# End of Getting Started Part 1.
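The script flags a few follow-on steps: splitting the FIPS code into state, county, and tract; computing a CV; and keeping only the flows that touch California. Below is a rough base-R sketch of those steps on mock data, so it runs without CTPPr or a network connection. The column names (RESIDENCE, WORKPLACE, Estimate, SE) follow the script above. The code formats are assumptions on my part (an 11-digit tract code and a 5-digit county code), so inspect the actual download_ctpp() output before reusing this. All values are made up.

```r
# Mock data standing in for download_ctpp() output.
# ASSUMED formats: 11-digit tract code, 5-digit county code. Values are invented.
tracts <- data.frame(
  RESIDENCE = c("06001400100", "06075010100"),
  Estimate  = c(3000, 0),
  SE        = c(150, 12),
  stringsAsFactors = FALSE
)

# Split the tract code into state (2 digits), county (3), and tract (6)
tracts$state  <- substr(tracts$RESIDENCE, 1, 2)
tracts$county <- substr(tracts$RESIDENCE, 3, 5)
tracts$tract  <- substr(tracts$RESIDENCE, 6, 11)

# Recover the 90 percent MOE from the SE, and compute the CV with a
# guard so that zero estimates give NA instead of Inf
tracts$MOE <- tracts$SE * 1.645
tracts$CV  <- ifelse(tracts$Estimate > 0, tracts$SE / tracts$Estimate, NA_real_)

# Mock county-to-county flows (5-digit state + county codes, invented values)
flows <- data.frame(
  RESIDENCE = c("06001", "06001", "32031", "04013"),
  WORKPLACE = c("06075", "32031", "06057", "41051"),
  Estimate  = c(50000, 120, 900, 15),
  stringsAsFactors = FALSE
)

# Classify each flow relative to California (state FIPS code "06")
ca     <- "06"
res_st <- substr(flows$RESIDENCE, 1, 2)
wrk_st <- substr(flows$WORKPLACE, 1, 2)
flows$type <- ifelse(res_st == ca & wrk_st == ca, "intra-California",
              ifelse(res_st == ca, "out of California",
              ifelse(wrk_st == ca, "into California", "other")))

# Retain intra-state records plus inter-state flows into/out of California
keep <- flows[flows$type != "other", ]
print(tracts)
print(keep)
```

If the 'FIPS and Name' output option puts the code and the name in a single column, the substr() positions would need adjusting, or the code portion would have to be extracted first.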