Life After American Factfinder: TidyCensus Step #0 - ctpp-news

6 Aug 2020

Here’s my first follow-up to my 7/16/2020 post on using tidycensus in a post-American
FactFinder era.

Attached to this e-mail is a short text file (“r” suffix) that can be edited for your
use.

Example #0. Setting up tidycensus.

This is an introduction to the use of the R-package tidycensus in extracting data from the
US Census Bureau’s American Community Survey. I’m adding snippets of R code from my
R-scripts, and attaching the full r-script to this message.

First things first: Acquaint yourself with the American Community Survey. What I would
strongly recommend is to download and print out copies of the various ACS survey
questionnaires. Know what was asked!

Decennial Census questionnaires:

https://www.census.gov/history/www/through_the_decades/questionnaires/

American Community Survey questionnaires:

https://www.census.gov/programs-surveys/acs/methodology/questionnaire-archive.html

Next, I would recommend downloading the “table shells” from the Census Bureau’s website,
rather than relying only on the tidycensus “load_variables” function. Get the table shells
for all of the years: the ACS does change every so often, and so do the tables! I find it
useful to keep part of my computer screen open with the table shells visible in Excel.

ACS Table Shells:

https://www.census.gov/programs-surveys/acs/technical-documentation/table-shells.html

I find it useful to have on hand a guide to the ACS table numbering scheme, so you know
your “B” and “C” and “S” and “GCT” tables and the two-digit subject indicator (for
example, “08” is “Journey to Work”):

https://censusreporter.org/topics/table-codes/

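To make the numbering scheme concrete, here is a minimal sketch in base R that splits a
table ID into its letter prefix and two-digit subject code. The helper name parse_table_id
is hypothetical (it is not part of tidycensus), and "B08301" is used only as an example ID.

```r
# Hypothetical helper (not part of tidycensus): split an ACS table ID
# into its table-type prefix and two-digit subject code.
parse_table_id <- function(table_id) {
  # Prefix letters ("B", "C", "S", "GCT", ...) followed by two subject digits.
  pattern <- "^([A-Z]+)([0-9]{2})"
  parts <- regmatches(table_id, regexec(pattern, table_id))[[1]]
  if (length(parts) == 0) {
    stop("Not a recognizable ACS table ID: ", table_id)
  }
  # parts[1] is the full match; parts[2] and parts[3] are the two groups.
  list(prefix = parts[2], subject = parts[3])
}

id_parts <- parse_table_id("B08301")
id_parts$prefix   # "B"
id_parts$subject  # "08", the Journey-to-Work subject
```
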
Download and install the free software package R Studio (it needs R itself to be installed
as well). There are plenty of YouTube videos you can watch about learning/installing R and
R Studio, and I won’t cover those here.

https://rstudio.com/products/rstudio/download/#download

Launch R Studio. There are a few add-on packages that first need to be installed onto your
computer, and then “loaded” into your working R session.

# Step 1: Install R packages. If installed in previous sessions, there is no need to
# re-install.

# You may need to install the packages "tidyr" and "sp" for
# "tidycensus" to be properly installed.

install.packages("tidyverse")

install.packages("tidycensus")

install.packages("janitor")
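
Since the packages only need to be installed once, one option is to check which ones are
still missing before calling install.packages(). This is only a sketch using base R’s
requireNamespace(); the helper name missing_packages is hypothetical.

```r
# Hypothetical helper: return the packages from `pkgs` that are not yet installed.
# Note: requireNamespace() loads a package's namespace as a side effect.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("tidyverse", "tidycensus", "janitor")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Still to install: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # un-comment to install them
}
```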

# Step 2: Load relevant libraries into each R-session.

library(tidyverse)

library(tidycensus)

library(janitor)

Acquire a Census API key from the Census Bureau. It’s free. It’s a 40-character string
that identifies a unique API user and helps the Census Bureau improve its tools for
accessing census data. They’ll e-mail you a key in no time at all.

https://www.census.gov/data/developers/updates/new-discovery-tool.html
https://api.census.gov/data/key_signup.html

Install your 40-character API key into your R “environment.” Do this just once, and you
won’t need to concern yourself with the key again.

# Step 3: Load the User's Census API Key.

# Census API Key was installed in previous sessions, so no need to re-install

# Un-comment the following statement and substitute the user's API key.

# census_api_key("fortycharacterkeysentbyCensusBureau",install=TRUE)
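
To confirm that the key really is installed, a quick check like the one below may help. It
is a sketch that assumes the key is stored in the environment variable CENSUS_API_KEY,
which, as far as I know, is what install=TRUE writes to the .Renviron file. The helper name
has_census_key is hypothetical.

```r
# Hypothetical helper: TRUE if a plausible Census API key is set in the environment.
# Assumes the key lives in the CENSUS_API_KEY environment variable.
has_census_key <- function() {
  key <- Sys.getenv("CENSUS_API_KEY")
  # The key is described above as a 40-character string.
  nchar(key) == 40
}

if (!has_census_key()) {
  message("No 40-character CENSUS_API_KEY found; run census_api_key() first.")
}
```

If the key was installed in the current session, R may need to be restarted before the
check sees it.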

The last section of this introduction covers using the “load_variables” function as a tool
to assist in selecting variables. I prefer to download the ACS Table Shells into
Excel, and then keep the appropriate Table Shells open alongside R Studio, to aid me in
variable selection and naming.

# Step 4: Explore the Data Variables using the load_variables() function

# Use the function load_variables() to view all of the possible variables for analysis

# load_variables works for both decennial census and American Community Survey databases

acs18_variable_list <- load_variables(year = 2018, dataset = "acs5", cache = TRUE)

acs18p_variable_list <- load_variables(year = 2018, dataset = "acs5/profile", cache = TRUE)

# Maybe write out the data frame to a CSV file (in the working directory), for easier use in Excel?

write.csv(acs18_variable_list,'acs18_variable_list.csv', row.names=FALSE)

View(acs18_variable_list)
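
The variable list is long, so it can help to narrow it to one table before viewing it. The
sketch below uses a small hand-typed stand-in with the column names load_variables()
returns (name, label, concept). The three rows are illustrative only; with the real data
you would substitute acs18_variable_list for variable_list.

```r
# Stand-in for the data frame returned by load_variables(): same column
# names (name, label, concept), but only a few hand-typed, illustrative rows.
variable_list <- data.frame(
  name    = c("B01001_001", "B08301_001", "B08301_010"),
  label   = c("Estimate!!Total",
              "Estimate!!Total",
              "Estimate!!Total!!Public transportation (excluding taxicab)"),
  concept = c("SEX BY AGE",
              "MEANS OF TRANSPORTATION TO WORK",
              "MEANS OF TRANSPORTATION TO WORK"),
  stringsAsFactors = FALSE
)

# Keep only the variables whose name starts with the table ID of interest.
b08301_vars <- variable_list[startsWith(variable_list$name, "B08301_"), ]
b08301_vars$name
```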

As of summer 2020, tidycensus can be used to extract the “base” and “collapsed”
tables for all years of the ACS: the “single year” databases from 2005 through 2018, and
the five-year databases from 2005/09 through 2014/18. It can also extract the decennial
census files for 2010, 2000 and 1990. For the decennial censuses, databases include the SF1
(Summary File #1) for 1990, 2000 and 2010; and the SF3 (Summary File #3) for 1990 and
2000. (There was no long form in the 2010 Census, so there is no long-form-based SF3
data for 2010!)
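
The coverage just described can be written down as a small lookup, which is handy for
checking a year before requesting data. This is only a sketch of the summer 2020 coverage
listed above; later tidycensus releases may differ. The dataset names ("acs1", "acs5",
"sf1", "sf3") follow tidycensus usage, while is_available is a hypothetical helper.

```r
# Coverage as described in the paragraph above (summer 2020).
available_years <- list(
  acs1 = 2005:2018,            # single-year ACS
  acs5 = 2009:2018,            # five-year ACS by end year (2005/09 to 2014/18)
  sf1  = c(1990, 2000, 2010),  # decennial Summary File 1
  sf3  = c(1990, 2000)         # decennial Summary File 3 (long form)
)

# Hypothetical helper: is this dataset/year combination in the lookup above?
is_available <- function(dataset, year) {
  dataset %in% names(available_years) && year %in% available_years[[dataset]]
}

is_available("acs5", 2008)  # FALSE: the first five-year file ends in 2009
is_available("sf3", 2010)   # FALSE: no long form in 2010
```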

(I have yet to explore how to pull data from the decennial censuses using tidycensus, and
would be grateful to hear news of successes/failures.)

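A word of warning: R is case sensitive. View(acs18_variable_list), with a capital V, opens
the data viewer. A lowercase view(acs18_variable_list) only works if a loaded package
happens to define a function by that name (recent versions of the tibble package, which is
loaded with tidyverse, do), so don’t count on it!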
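
A quick way to see the case sensitivity for yourself, using a throwaway variable
(acs_year is just an illustrative name):

```r
# R treats names that differ only in case as different objects.
acs_year <- 2018

exists("acs_year")  # TRUE
exists("ACS_year")  # FALSE: different capitalization, different name
```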

That’s the end of Step #0… Setting up Tidycensus!

Chuck Purvis, Hayward, California
Retired Person (formerly of the Metropolitan Transportation Commission, San Francisco,
California)