Here’s my first follow-up to my 7/16/2020 post on using tidycensus in a post-American FactFinder era.

Attached to this e-mail is a short text file (“r” suffix) that can be edited for your use.

Example #0. Setting up tidycensus.

This is an introduction to using the R package tidycensus to extract data from the US Census Bureau’s American Community Survey (ACS). I’m including snippets of R code from my R scripts; the full script is attached to this message.

First things first: Acquaint yourself with the American Community Survey. What I would strongly recommend is to download and print out copies of the various ACS survey questionnaires. Know what was asked!

Decennial Census questionnaires:

https://www.census.gov/history/www/through_the_decades/questionnaires/

 American Community Survey questionnaires:

https://www.census.gov/programs-surveys/acs/methodology/questionnaire-archive.html

 

Next, I would recommend downloading the “table shells” from the Census Bureau’s website, rather than relying only on the tidycensus “load_variables” function. Get the table shells for all of the years: the ACS does change every so often, and so do the tables! I find it useful to keep the table shells open in Excel on part of my computer screen.

 

ACS Table Shells:

https://www.census.gov/programs-surveys/acs/technical-documentation/table-shells.html

I find it useful to have on hand a guide to the ACS table numbering scheme, so you know your “B” and “C” and “S” and “GCT” tables and the two-digit subject indicator (for example, “08” is Journey-to-Work).

https://censusreporter.org/topics/table-codes/
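
To make the numbering scheme concrete, here is a minimal sketch in base R that splits a table ID into its leading letters and its two-digit subject code. The helper name parse_table_code is hypothetical (it is not part of tidycensus), and the pattern only assumes that a table ID starts with capital letters followed by the two subject digits.

```r
# Hypothetical helper (not part of tidycensus): split an ACS table ID
# into its leading letters and its two-digit subject code.
parse_table_code <- function(table_id) {
  parts <- regmatches(table_id, regexec("^([A-Z]+)([0-9]{2})", table_id))[[1]]
  if (length(parts) == 0) {
    stop("Not a recognizable ACS table ID: ", table_id)
  }
  list(table_type = parts[2], subject = parts[3])
}

parse_table_code("B08301")   # table_type "B", subject "08"
parse_table_code("S0801")    # table_type "S", subject "08"
```

Both examples land in subject “08”, which is why a base table and a subject table on commuting can be found side by side in the table shells.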

Download and install R and the free software package RStudio. There are YouTube videos you can watch about learning and installing R and RStudio, and I won’t cover those here.

https://rstudio.com/products/rstudio/download/#download

 

Launch RStudio. There are a few add-on packages that first need to be installed onto your computer, and then “loaded” into your working R session.

 

# Step 1: Install R packages. If installed in previous sessions, there is no need to re-install.

# You may need to install the packages "tidyr" and "sp" for "tidycensus" to be properly installed.

 

install.packages("tidyverse")

install.packages("tidycensus")

install.packages("janitor")
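
If you would rather not re-run the installs by hand, one possible approach is to check which packages are missing and install only those. This is a sketch, not part of the attached script; the helper name missing_packages is hypothetical, and the install step is limited to interactive sessions (such as RStudio) as a precaution.

```r
# Packages used in this walkthrough.
needed <- c("tidyverse", "tidycensus", "janitor")

# Hypothetical helper: return the subset of `pkgs` that is not installed yet.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

to_install <- missing_packages(needed)

# Only install when running interactively (for example, in RStudio).
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```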

 

# Step 2: Load relevant libraries into each R-session.

 

library(tidyverse)

library(tidycensus)

library(janitor)

 

Acquire a Census API key from the Census Bureau. It’s free. It’s a 40-character string that identifies a unique API user and helps the Census Bureau improve its tools for accessing census data. They’ll e-mail you a key in no time at all.

https://www.census.gov/data/developers/updates/new-discovery-tool.html

https://api.census.gov/data/key_signup.html

 

Install your 40-character API key into your R “environment.” You only need to do this once; after that, there is no need to concern yourself about this key again.

# Step 3: Load the User's Census API Key.

# If the Census API key was installed in a previous session, there is no need to re-install it.

# Otherwise, un-comment the following statement and substitute the user's own API key.

# census_api_key("fortycharacterkeysentbyCensusBureau",install=TRUE)
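
To confirm that a key is actually in place, a quick check along these lines may help. tidycensus keeps an installed key in the CENSUS_API_KEY environment variable; the helper key_looks_valid is hypothetical, and its test (exactly 40 letters or digits) is an assumption based on the key format described above, not an official validation.

```r
# Hypothetical helper: TRUE if `key` is a single 40-character string
# made up of letters and digits (an assumed format, not an official check).
key_looks_valid <- function(key) {
  is.character(key) && length(key) == 1 && grepl("^[A-Za-z0-9]{40}$", key)
}

# tidycensus stores an installed key in this environment variable.
stored_key <- Sys.getenv("CENSUS_API_KEY")

if (key_looks_valid(stored_key)) {
  message("A Census API key appears to be installed.")
} else {
  message("No usable key found; run census_api_key() with install = TRUE.")
}
```

If the key was installed during the current session, it may not be visible until R is restarted or readRenviron("~/.Renviron") is run.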

 

The last section of this introduction relates to using the “load_variables” function as a tool to assist in selecting variables. I prefer to download the ACS Table Shells into Excel, and then have the appropriate Table Shells open alongside RStudio, to aid me in variable selection and naming.

 

# Step 4: Explore the Data Variables using the load_variables() function

# Use the function load_variables() to view all of the possible variables for analysis

# load_variables works for both decennial census and American Community Survey databases

 

acs18_variable_list <- load_variables(year = 2018, dataset = "acs5", cache = TRUE)

acs18p_variable_list <- load_variables(year = 2018, dataset = "acs5/profile", cache = TRUE)

# Maybe write out the data frame as a CSV file (saved in the current working directory), for easier use in Excel?

write.csv(acs18_variable_list,'acs18_variable_list.csv', row.names=FALSE)

 

View(acs18_variable_list)
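
The full variable list is long, so it can help to narrow it to one table at a time. Below is a minimal sketch in base R; the helper variables_in_table is hypothetical, and the three-row example_list is an illustrative stand-in with the same name, label and concept columns that load_variables() returns, so that the snippet runs without an API call.

```r
# Hypothetical helper: keep only the rows whose variable name starts with
# a given table ID. Expects the "name" column that load_variables() returns.
variables_in_table <- function(variable_list, table_id) {
  variable_list[startsWith(variable_list$name, table_id), ]
}

# A tiny illustrative stand-in for acs18_variable_list.
example_list <- data.frame(
  name    = c("B01001_001", "B08301_001", "B08301_002"),
  label   = c("Estimate!!Total", "Estimate!!Total",
              "Estimate!!Total!!Car, truck, or van"),
  concept = c("SEX BY AGE", "MEANS OF TRANSPORTATION TO WORK",
              "MEANS OF TRANSPORTATION TO WORK"),
  stringsAsFactors = FALSE
)

# Keeps the two B08301 rows and drops the B01001 row.
variables_in_table(example_list, "B08301")

# With the real list: variables_in_table(acs18_variable_list, "B08301")
```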

 

As of summer 2020, tidycensus can be used to extract the “base” and “collapsed” tables for all years of the ACS: the “single year” databases from 2005 through 2018, and the five-year databases from 2005/09 through 2014/18. It can also extract the decennial census files for 2010, 2000 and 1990: the SF1 (Summary File #1) for 1990, 2000 and 2010, and the SF3 (Summary File #3) for 1990 and 2000. (There was no long form in the 2010 Census, and thus no long-form-based SF3 data for 2010!)

 

(I have yet to explore how to pull data from the decennial censuses using tidycensus, and would be grateful to hear news of successes/failures.)

 

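A word of warning: R is case sensitive. Something like load_variables(year = 2018, dataset = "acs5") will work okay, but Load_Variables(year = 2018, dataset = "acs5") will not work! (View() versus view() is a special case: base R only has the capitalized View(), though the tibble package that loads with the tidyverse also supplies a lowercase view().)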
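
A small illustration of the point, using a made-up object name (commute_table is hypothetical and appears nowhere in the attached script):

```r
# Create an object with an all-lowercase name.
commute_table <- "B08301"

exists("commute_table")   # TRUE
exists("Commute_Table")   # FALSE: different capitalization, different name
```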


That’s the end of Example #0… Setting up tidycensus!


Chuck Purvis, Hayward, California
Retired Person (formerly of the Metropolitan Transportation Commission, San Francisco, California)