Dear John –


Here is my 2007 document about LEHD On the Map.

We have not had time to re-evaluate the new data (QCEW 2005 and 2006) released in the last couple of months from On the Map.

It is VERY important to understand that the LEHD On the Map includes “covered” employment only.  This means workers who are covered by unemployment insurance. 


To answer John’s specific questions:

1.  The workplace location is taken from the QCEW (aka ES 202 files).  The quality of the workplace location relies on the quality of the QCEW files that EACH State provides to the Census Bureau (CB), particularly the quality and completeness of the voluntary reporting for Multiple Site Businesses.  Workers who work for a business with multiple sites are assigned to a work location based on data from Minnesota, where the state requires that EACH employee be linked to a specific site, not just to the employer.  In other states, businesses are not required to link individual employees to a specific site.  For example, a large grocery business with 10 stores in one county may or may not report all 10 store locations, and the assignment of an employee to one of the 10 stores uses the results from Minnesota.  Some states expend more staff resources to research businesses and build more complete files of multiple sites, even when the business itself does not report those locations.


2.  Residence location is taken from federal administrative records, and then moved slightly to protect confidentiality.  Most residences are kept within the same census tract.  Fredrik Andersson presented a paper at ASA in 2007 (?) on the proximity of the synthetic data to the real data. 


3.  Social Security Number (SSN) is the variable that ties the different datasets together, but the SSN is not kept on the files.  The actual SSN is changed into a unique ID.  And the data, when reported in On the Map, are not microdata records but are grouped as synthetic block data.


Here is the link to the Census Bureau’s documentation:
The paper I relied on earlier was:
"LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators" by J. Abowd, B.E. Stephens, L. Vilhuber, F. Andersson, K.L. McKinney, M. Roemer, and S. Woodcock, dated December 5, 2005


There is an NCHRP 08-36 project, Task 81, which is a small (“Quick Response”) research project to examine how ACS and LEHD On the Map could potentially be combined.  Cambridge Systematics is conducting the research, and the NCHRP contact is Nanda Srinivasan.


I don’t know why the word “synthetic” is raising so many red flags.  Transportation planners have long used FRATAR and IPF, both of which result in synthetic data.  Another example of synthetic data is TRANSIMS, which uses a combination of Census 2000 block data and Census 2000 PUMS to create synthetic households for each block.
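For readers who have not seen IPF (iterative proportional fitting), here is a minimal sketch of the idea, not the method used to build On the Map. It takes a seed table of home-to-work flows and alternately scales rows and columns until they match known totals of workers by home zone and jobs by work zone. The function name `ipf` and all of the numbers are hypothetical and chosen only for illustration.

```python
def ipf(seed, row_targets, col_targets, iterations=100):
    """Scale a seed table until its row and column sums match the targets.

    seed        -- list of rows (home zone x work zone), e.g. an old survey
    row_targets -- known workers living in each home zone
    col_targets -- known jobs located in each work zone
    """
    table = [row[:] for row in seed]  # work on a copy, leave the seed intact
    for _ in range(iterations):
        # Row step: scale each home zone so its flows sum to its target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [value * target / total for value in table[i]]
        # Column step: scale each work zone so its flows sum to its target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
    return table


# Hypothetical two-zone example: the seed pattern is kept as far as possible
# while the margins are forced to the new totals.
seed = [[40.0, 10.0],
        [20.0, 30.0]]
workers_by_home = [60.0, 40.0]
jobs_by_work = [50.0, 50.0]

flows = ipf(seed, workers_by_home, jobs_by_work)
for row in flows:
    print([round(value, 2) for value in row])
```

The individual cells in the result are "synthetic" in the sense that no one observed them directly; they are the seed pattern adjusted to agree with the observed totals.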


I don’t think that synthetic data can summarily be called GOOD or BAD.  Data users need to understand the process and the data sources used in the synthesis process.  Because the sample size in ACS is MUCH smaller than the decennial census “long form” in 1990 or in 2000, the Census Bureau’s Disclosure Review Board has established more stringent rules about what can or can’t be released from the ACS for CTPP to protect individual confidentiality, so we (the CTPP group) are talking much more about data synthesis for the ACS data.


Hope this helps.


Elaine Murakami

FHWA Office of Planning (Wash DC)

206-220-4460 (in Seattle)




From: [] On Behalf Of John Hodges-Copple
Sent: Monday, October 06, 2008 5:39 AM
Subject: [CTPP] seeking guidance on worker flows from the local employment dynamics On The Map data


Does anyone have a short, "plain English" explanation of the residence-to-workplace flows from this data and how it compares to the old long-form commuting data from the 2000 and earlier censuses (censi?).  I read "synthesized" data and little red flags go up.  Specifically, is this data based on actual residence and workplace data of real individuals (as with the Census), or are the residence and workplace locations from different data sources and the travel between the 2 synthesized in some way, as a travel demand model would create travel patterns between the 2?


Any guidance would be appreciated; my brief hunting through the documentation didn't give me the clear specifics I was hoping for.




John Hodges-Copple, Planning Director
Triangle J Council of Governments
PO Box 12276
Research Triangle Park, NC  27709

----- Original Message -----

From: Paddock, Bob

Sent: Thursday, October 02, 2008 11:36 AM

Subject: RE: [CTPP] Need your thoughts on CTPP products using ACS standard tables



The tables currently found in the ACS have been sufficient for my needs and concerns.  The idea expressed by Nathan is intriguing (if I understand it correctly); by “multiple geographic units”, do we mean various MCDs, TAZs or the 20,000 population areas?

Concerning the “Journey to Work Trends”, I made use of that data primarily as comparative analysis for the Twin Cities region to other MSAs and as a template for a more detailed look at Minneapolis-St. Paul.  However, given the workload that appears to exist, I don’t believe that updating the report to include the 2005-07 information would be useful enough to spend additional time and resources ... at least for my specific needs and wants.




From: [] On Behalf Of Murakami, Elaine
Sent: Wednesday, October 01, 2008 4:18 PM
Subject: [CTPP] Need your thoughts on CTPP products using ACS standard tables

Hi Everyone –

I bet you have a lot of questions about CTPP using the first 3 years of ACS and TAZs, but unfortunately, I can’t answer them yet!

Given the current uncertainty of the next CTPP (“custom tabulation”) using the ACS, we are moving forward to develop products using standard ACS products.  Some of you will recall that we created a series using the first 2005 ACS data products.  They are posted on both the FHWA web and on the AASHTO web.

On December 9, 2008, the Census Bureau plans to release the first 3-year ACS products (surveys completed in 2005, 2006 and 2007).  The minimum population threshold is 20,000 for the 3-year products, compared to 65,000 population for the ACS 1-year products.  So, while the data is still “swiss cheese,” that is, geographic coverage has holes, a lot more geographic units will be available.  The results are still subject to the Census Bureau rules of “collapsing and filtering,” which means that sometimes the data have been suppressed and you will see an "N".

We are now designing new profile sheets, in which we plan to include data from 2000 (using Census Summary File 3 and CTPP2000) and from 2005-2007 ACS.  Please let me know if you have any recommendations for specific tables to include (the data must be available in both 2000 and from the 2005-2007 ACS).  One recommendation from Nathan Erlbaum (NYS DOT) is to create a spreadsheet macro that will sum up multiple geographic units and re-calculate the Margin of Error (using the materials on pages 96-98 in NCHRP Report 588).
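As a rough illustration of what such a macro would compute, here is a minimal sketch. It assumes the usual Census Bureau approximation for a sum of ACS estimates, where the combined Margin of Error is the square root of the sum of the squared MOEs; it is not taken from the NCHRP Report 588 text, and it ignores any covariance between the units being summed. The function name `combine_estimates` and the numbers are hypothetical.

```python
import math


def combine_estimates(estimates, moes):
    """Sum ACS estimates and approximate the MOE of the sum.

    Assumes the root-sum-of-squares approximation:
        MOE_sum = sqrt(MOE_1^2 + MOE_2^2 + ...)
    """
    if len(estimates) != len(moes):
        raise ValueError("each estimate needs its own margin of error")
    total = sum(estimates)
    moe_total = math.sqrt(sum(moe * moe for moe in moes))
    return total, moe_total


# Hypothetical workers and MOEs for three geographic units being combined.
workers = [1200, 800, 450]
margins = [150, 120, 90]

total, moe = combine_estimates(workers, margins)
print(f"{total} workers, +/- {moe:.0f}")
```

Note that the combined MOE grows more slowly than the combined estimate, which is why summing small units into a larger area usually gives a relatively more reliable number.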

Also, I am wondering if there is any interest in an updated “Journey to Work Trends” report to include the 2005-2007 ACS results.  This report was limited to metropolitan areas with population over 1 million, but had trend data including 1960, 1980, 1990 and 2000.  Because of redefinitions of metropolitan areas by OMB, the data need to be accumulated from county records for historical comparability, which makes for quite a bit of work.  The last report used the 1999 definition, but the 2005-2007 ACS data will be reported using the 2007 OMB definitions (I think).  My question for you is:  is this report useful enough to spend time and resources on?

Thanks in advance for your opinions.

Elaine Murakami

FHWA Office of Planning (Wash DC)

206-220-4460 (in Seattle)


ctpp-news mailing list