Dear John -
Here is my 2007 document about LEHD On the Map.
We have not had time to re-evaluate new data (QCEW 2005 and 2006)
released in the last couple of months from On the Map.
It is VERY important to understand that the LEHD On the Map includes
"covered" employment only. This means workers who are covered by
To answer John's specific questions:
1. The workplace location is taken from the QCEW (aka ES 202 files).
The quality of the workplace location relies on the quality of the QCEW
files that EACH State provides to the CB, particularly the quality and
completeness of voluntary Multiple Site Businesses. Workers employed by
a business with multiple sites are assigned to a work location based on
data from Minnesota, where the state requires that EACH employee be
linked to a specific site, not just the employer. In other
states, businesses are not required to link individual employees to a
specific site. For example, a large grocery business with 10 stores in
one county may or may not report all 10 store locations, and the
assignment of an employee to one of the 10 stores uses the results from
Minnesota to do the assignment. Some states expend more staff
resources to research businesses to build more complete files of
multiple sites, even when the business itself does not report those
2. Residence location is taken from federal administrative records, and
then moved slightly to protect confidentiality. Most residences are
kept within the same census tract. Fredrik Andersson presented a paper
at ASA in 2007 (?) on the proximity of the synthetic data to the real
3. Social Security Number (SSN) is the variable that ties the different
datasets together, but the SSN is not kept on the files. The actual SSN
is changed into a unique ID. And the data when reported in On the Map
is not for a microdata record, but is grouped as synthetic block data.
Here is the link to the Census Bureau's documentation:
The paper I relied on earlier was:
"LEHD Infrastructure Files and the Creation of the Quarterly Workforce
Indicators" by J. Abowd; B.E. Stephens, L. Vilhuber, F. Andersson, K.L.
McKinney, M. Roemer, and S. Woodcock, dated December 5, 2005
There is an NCHRP 08-36 project, task 81 which is a small ("Quick
Response") research project to examine how ACS and LEHD On the Map could
potentially be combined. Cambridge Systematics is conducting the
research, and the NCHRP contact is Nanda Srinivasan.
I don't know why the word "synthetic" is raising so many red flags.
Transportation planners have long used FRATAR and IPF, which result in
synthetic data. Another example of synthetic data is where TRANSIMS is
using a combination of Census 2000 block data with Census 2000 PUMS to
create synthetic households for each block.
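For readers who have not seen IPF (iterative proportional fitting) written out, here is a minimal sketch in Python. It is only an illustration of the general technique, not the procedure used by On the Map or TRANSIMS. The function name, the seed table, and the row and column targets are all made up for the example.

```python
def ipf(seed, row_targets, col_targets, max_iter=1000, tol=1e-9):
    """Scale a seed table until its row and column sums match the targets.

    Hypothetical helper for illustration only. Assumes the row targets and
    column targets add up to the same grand total.
    """
    table = [row[:] for row in seed]  # work on a copy of the seed
    for _ in range(max_iter):
        # Step 1: scale each row so it sums to its target.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                factor = target / row_sum
                table[i] = [value * factor for value in table[i]]
        # Step 2: scale each column so it sums to its target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                factor = target / col_sum
                for row in table:
                    row[j] *= factor
        # Columns now match exactly, so check how far the rows have drifted.
        row_error = max(
            abs(sum(table[i]) - target) for i, target in enumerate(row_targets)
        )
        if row_error < tol:
            break
    return table


# Made-up example: a 2x2 seed pattern fitted to new marginal totals.
seed = [[1.0, 2.0], [3.0, 4.0]]
fitted = ipf(seed, row_targets=[40.0, 60.0], col_targets=[30.0, 70.0])
```

The fitted table keeps the pattern of the seed but honors the new totals. None of its cells was observed directly, which is the sense in which the result is "synthetic".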
I don't think that synthetic data can summarily be called GOOD or BAD.
Data users need to understand the process and the data sources used
in the synthesis process. Because the sample size in ACS is MUCH
smaller than the decennial census "long form" in 1990 or in 2000, we
(the CTPP group) are talking much more about data synthesis for the ACS
data, because the Census Bureau's Disclosure Review Board has
established more stringent rules about what can or can't be released
from the ACS for CTPP to protect individual confidentiality.
Hope this helps.
FHWA Office of Planning (Wash DC)
206-220-4460 (in Seattle)
[mailto:firstname.lastname@example.org] On Behalf Of John Hodges-Copple
Sent: Monday, October 06, 2008 5:39 AM
Subject: [CTPP] seeking guidance on worker flows from the local
employment dynamics On The Map data
Does anyone have a short, "plain English" explanation of the
residence-to-workplace flows from this data and how it compares to the
old long-form commuting data from the 2000 and earlier censuses
(censi?). I read "synthesized" data and little red flags go up.
Specifically, is this data based on actual residence and workplace data
of real individuals (as with the Census), or are the residence and
workplace locations from different data sources and the travel between
the 2 synthesized in some way, as a travel demand model would create
travel patterns between the 2?
Any guidance would be appreciated; my brief hunting through the
documentation didn't give me the clear specifics I was hoping for.
John Hodges-Copple, Planning Director
Triangle J Council of Governments
PO Box 12276
Research Triangle Park, NC 27709
----- Original Message -----
From: Paddock, Bob <mailto:email@example.com>
Sent: Thursday, October 02, 2008 11:36 AM
Subject: RE: [CTPP] Need your thoughts on CTPP products using
The tables currently found in the ACS have been sufficient for
my needs and concerns. The idea expressed by Nathan is intriguing (if I
understand it correctly); by "multiple geographic units", do we mean
various MCDs, TAZs or the 20,000 population areas?
Concerning the "Journey to Work Trends", I made use of that data
primarily as comparative analysis for the Twin Cities region to other
MSAs and as a template for a more detailed look at Minneapolis-St.Paul.
However, given the workload that appears to exist, I don't believe that
updating the report to include the 2005-07 information would be useful
enough to spend additional time and resources ... at least for my
specific needs and wants.
----- Original Message -----
[mailto:firstname.lastname@example.org] On Behalf Of Murakami, Elaine
Sent: Wednesday, October 01, 2008 4:18 PM
Subject: [CTPP] Need your thoughts on CTPP products using ACS
Hi Everyone -
I bet you have a lot of questions about CTPP using the first 3
years of ACS and TAZs, but unfortunately, I can't answer them yet!
Given the current uncertainty of the next CTPP ("custom
tabulation") using the ACS, we are moving forward to develop products
using standard ACS products. Some of you will recall that we created a
series using the first 2005 ACS data products. They are posted on both
the FHWA web http://www.fhwa.dot.gov/planning/census/2005tpoverview.htm
and on the AASHTO web
On December 9, 2008, the Census Bureau plans to release the
first 3-year ACS products (surveys completed in 2005, 2006 and 2007).
The minimum population threshold is 20,000 for the 3-year products,
compared to 65,000 population for the ACS 1-year products. So, while
the data is still "Swiss cheese," that is, geographic coverage has
holes, a lot more geographic units will be available. The results are
still subject to the Census Bureau rules of "collapsing and filtering"
which means that sometimes the data have been suppressed and you will
see an "N".
We are now designing new profile sheets, in which we plan to
include data from 2000 (using Census Summary File 3 and CTPP2000) and
from 2005-2007 ACS. Please let me know if you have any recommendations
for specific tables to include (the data must be available in both 2000
and from the 2005-2007 ACS). One recommendation from Nathan Erlbaum
(NYS DOT) is to create a spreadsheet macro that will sum up multiple
geographic units and re-calculate the Margin of Error (using the
materials on pages 96-98 in NCHRP Report 588).
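As a rough sketch of what such a calculation might look like, here is the idea in Python rather than as a spreadsheet macro. It uses the commonly published approximation for ACS estimates, where the MOE of a sum is the square root of the sum of the squared MOEs. I am assuming that is the method the NCHRP 588 pages describe. The function name and the numbers are made up for illustration.

```python
import math


def combine_estimates(estimates, moes):
    """Sum estimates across geographic units and approximate the new MOE.

    Hypothetical helper for illustration only. Uses the root-sum-of-squares
    approximation, which ignores any covariance between the units.
    """
    if len(estimates) != len(moes):
        raise ValueError("need one MOE per estimate")
    total = sum(estimates)
    combined_moe = math.sqrt(sum(moe ** 2 for moe in moes))
    return total, combined_moe


# Made-up example: two areas with estimates of 1,000 and 2,000 workers
# and MOEs of 30 and 40.
total, moe = combine_estimates([1000, 2000], [30, 40])
```

Because the approximation ignores covariance between the areas, it becomes less reliable as more units are added together, so treat the combined MOE as a guide and not as an exact figure.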
Also, I am wondering if there is any interest in an updated
"Journey to Work Trends" report to include the 2005-2007 ACS results.
This report was limited to
metropolitan areas with population over 1 million, but had trend data
including 1960, 1980, 1990 and 2000. Because of redefinitions of
metropolitan areas by OMB, the data need to be accumulated from county
records for historical comparability, which makes for quite a bit of
work. The last report used the 1999 definition, but the 2005-2007 ACS
data will be reported using the 2007 OMB definitions (I think). My
question for you is: is this report useful enough to spend time and
Thanks in advance for your opinions.
FHWA Office of Planning (Wash DC)
206-220-4460 (in Seattle)
ctpp-news mailing list