Interesting stuff given the recent discussion about ACS and travel times. The full press release
can be found at
Today the U.S. Census Bureau will hold a news conference to discuss new American Community Survey
data showing that Albany County residents have higher levels of education and household income than
the average resident of New York state. Other data showed that New York state residents spend more
time commuting to work than those of any other state. Data will highlight socio-economic and
housing characteristics for New York state, Albany County and
other areas of 250,000 population or more within the state.
Here is information I provided to Chandler Felt (original query).
The Federal Highway Administration sponsored research comparing the ACS (1999-2001) test data to the 2000 decennial data. We used 6 counties and later compared tract-level data for 2 counties. This research was conducted on-site at the CB offices in Suitland, with access to the microdata. (Wende Mix, formerly at Westat)
To summarize the attached file, the average travel times were significantly different between the ACS and the decennial. Generally speaking, the average travel times in Census 2000 are from 0.8 to 1.8 minutes LONGER than the ACS-reported times. As Phil notes, when examining the travel time distributions, the ACS curves are shifted to the left (shorter times) compared to the decennial. (My file with bar charts showing the distributions is too big for the listserv!)
County           ACS    Decennial
Pima County      25.2   26.7 *
San Francisco    33.3   34.6 *
Broward, FL      28.7   30.5 *
Lake, IL         34.6   34.0 *
Flathead, MT     20.6   21.4
Bronx, NY        48.0   49.2 *
* = significantly different
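As a quick check on the table, the gap for each county can be computed directly. This is only a minimal sketch: the figures are copied from the table, while the names `travel_times` and `decennial_minus_acs` are made up for illustration. It shows that Lake, IL is the one county where the ACS time is the longer of the two.

```python
# Mean travel times in minutes, copied from the table: (ACS, Census 2000).
travel_times = {
    "Pima County": (25.2, 26.7),
    "San Francisco": (33.3, 34.6),
    "Broward, FL": (28.7, 30.5),
    "Lake, IL": (34.6, 34.0),
    "Flathead, MT": (20.6, 21.4),
    "Bronx, NY": (48.0, 49.2),
}


def decennial_minus_acs(times):
    """Return {county: decennial minus ACS}, rounded to one decimal place."""
    return {county: round(dec - acs, 1) for county, (acs, dec) in times.items()}


gaps = decennial_minus_acs(travel_times)
for county, gap in gaps.items():
    # A positive gap means the decennial travel time is longer than the ACS one.
    print(f"{county:<15} {gap:+.1f} min")
```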
When examined spatially (by tract) and by mode, there was no pattern found in our limited research to explain the difference. We did not compare by month or season, but some of us suspect some seasonality differences. Another factor may be that the decennial is more likely to capture longer distance commutes. We have not yet been able to compare this, but have an NCHRP project that includes this question.
Other researchers believe that the use of field interviewers is resulting in overall higher quality data.
We agree that it is not wise to compare the "current" ACS results directly with decennial 2000, since it appears that methodological differences are making a significant difference in this variable.
FHWA Office of Planning
The following question was referred to me for comment and help, and I don't
recall seeing any discussion of this on the CTPP listserver. If you have any
thoughts on this, I will pass them along to the questioner. Thanks in
advance for any insight you may have.
Edward L. Hillsman, Ph.D. Washington State DOT
TDM Lead Researcher Public Transportation/Rail
tel 360.705.7887 310 Maple Park Avenue SE
fax 360.705.6862 P.O. Box 47387
hillsme(a)wsdot.wa.gov Olympia, WA 98504-7387 USA
My colleague and I have been analyzing 2000 Census data and 2002 ACS data on
commuting patterns, here in King County, WA with comparisons across the
nation. The average 2002 commute time to work, reported in ACS for King
County, was 25.0 minutes, down 1.5 minutes (5.7%) in the two years since the
2000 Census. Several comparable cities / counties across the US reported a
similar pattern. St Louis, Houston, Boston, Oakland, Atlanta, and Washington
DC all showed significant declines in average commute times over the two
year period (10% decline in Washington DC). San Diego and New York City
declined slightly. Of the ten metro counties we examined, only Cook County
(Chicago) went up (by 3.8%).
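For anyone checking the King County arithmetic, here is a minimal sketch. It assumes the 2000 baseline is simply the 2002 ACS figure plus the reported 1.5-minute drop; the function name `percent_change` is illustrative only.

```python
def percent_change(old, new):
    """Percent change from old to new, relative to old."""
    return (new - old) / old * 100.0


acs_2002 = 25.0               # minutes, 2002 ACS figure for King County
census_2000 = acs_2002 + 1.5  # implied 2000 baseline: 26.5 minutes

change = percent_change(census_2000, acs_2002)
print(f"{change:.1f}%")
```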
Our question is:
Is the widespread reduction in commute time a function of the nationwide
recession and decreasing employment levels, differences in the statistical
instruments used, or something else? It's hard to believe the recession
alone shows up that profoundly in commute times. The ACS question appears to
be worded the same as the decennial Census question, but of course the
sample is much smaller. We used the midpoint of the range the ACS provides.
Can anyone think of an intervening variable we may have overlooked? Thank
you for any insight you can provide!
I assume the most important thing MPO staffs can do with the draft CTPP
part 2 tables is to see if geocoding of at least the major employers was
accurate, after all the effort that went into "Work-up" back in 1999. In
lieu of any other "benchmarks" (like SF3 for Part 1 data) all I've done so
far is compare county totals to the county-to-county journey to work flows
that came out a few months back. (This state at least seemed to have the
same place of work totals in both files.)
Ohio DOT, Office of Technical Services
1980 W. Broad Street, Columbus, OH 43223
Phone: 614-644-6796, Fax: 614-752-8646
"When the going gets weird, the weird turn pro." Hunter S. Thompson
In using the CTPP Part 2 tables I have found some instances where there is
no row/entry for a particular TAZ. For example, I was looking for data on
occupation by means of transportation to work and there is no entry for one
of the TAZs in the town I am looking at. Has anybody else run into similar
situations? My guess is that this is how the tables indicate "no data" or
zero values, but I would appreciate anybody being able to confirm this (or,
if my guess is wrong, let me know why rows for particular TAZs may be
missing from tables).
Genesee Transportation Council
Apologies for cross-posting!
Abstracts are due March 1, 2004. The TRB conference on Research on Women's Issues in Transportation is scheduled for November 18-20, 2004, in Chicago. The conference website is http://www.TRB.org/Conferences/Women
Papers on commuting using Census data (SF3, CTPP and PUMS), particularly those looking at trends from 1980, 1990 and 2000, would be welcome.
FHWA Office of Planning
Ed and CTPP-News:
The rounding of values inside the CTPP is, right now, a modest, annoying data processing issue. As professional data analysts, we are always on the lookout to make sure our numbers "add up" so that we're not missing anything. Rounding should be a privilege of the data analyst, AFTER all of the precise number-crunching has been performed. So, I want to make sure in my data analysis that the year 2000 total population of my region is ALWAYS 6,783,760. IF IT'S DIFFERENT, THEN I MADE A MISTAKE THAT I HAVE TO CORRECT. After I get the precise number, then I can do the rounding off to my heart's desire, that is, 6.8 million persons, or 7 million, whatever. It is annoying, frustrating, an inconvenience, and a pain to NOT have the numbers add up!
The Census Bureau's use of rounding is an attempt at "disclosure avoidance" that is, to foil attempts of the data analyst to "reverse engineer" the precise name, address, and characteristics of individuals and their households. I frankly do not believe that rounding is the best method for ensuring disclosure avoidance. I believe other mathematical techniques to "dither" or randomize the reported data would be more useful, in terms of disclosure avoidance, and useful to the analyst, in terms of removing all of the rounding errors inherent in the current CTPP. My recommendation to the Census Bureau: do the right thing and hire mathematicians to find best methods to a) protect the identity of respondents; and b) to make things easier for the data user.
Frankly, you can use American Factfinder to enter your home address, and get the block-level population of persons on your block by race, by sex and by age. So then how is the Census Bureau providing "disclosure avoidance" for standard products like Summary File #1? If the Census Bureau had implemented rounding on standard census products such as SF1, SF2, SF3, and SF4 then there would have been a riot among the data users, Congress would have intervened, and the Census Bureau would be backtracking as fast as you could say Appropriations Committee.
Right now we have two classes of Census Bureau products: "first class" products such as the summary files and the Public Use Microdata Sample, where there is (thank goodness!) NO rounding at all. (There are data thresholds in SF2 and SF4, but that's another matter.) The "second class" products are the CTPP and the EEO files, where there is rounding of data to the nearest 10, 15, 20, etc. Perhaps it is the intent of the Census Bureau to implement rounding in future releases of "regular" Census Bureau products, such as American Community Survey and 2010 Census short form data. That would be a big mistake.
The rounding of data in the CTPP guarantees loss of productivity: the data analyst will lose productivity in terms of always second-guessing the data processing steps (is a tract or zone missing? are there problems in my computer code?); and the data analyst will lose time in explaining to data users: WHY THE NUMBERS DO NOT ADD UP!
Try explaining why: 10 + 10 + 10 + 10 = 50 !!!
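A toy illustration of how this can happen, assuming (purely for the sketch) that every published cell and every published total is independently rounded to the nearest multiple of 5. The actual CTPP rounding rules may differ, and the cell values and the helper `round_to_5` are hypothetical.

```python
def round_to_5(count):
    """Round a non-negative integer to the nearest multiple of 5 (assumed rule)."""
    return (count + 2) // 5 * 5


cells = [12, 12, 12, 12]  # hypothetical unrounded cell counts

published_cells = [round_to_5(c) for c in cells]  # each cell rounded on its own
published_total = round_to_5(sum(cells))          # total rounded from the true sum

print(published_cells, "sum to", sum(published_cells),
      "but the published total is", published_total)
```

Each cell loses 2 to rounding while the true total of 48 rounds up, so the published cells sum to 40 against a published total of 50.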
I have spent too much time over the past 20+ years explaining the difference between commuters and "home-based work" trips; and "workers at work" and "total employment." Now, we can be guaranteed to spend a heck of a lot more time explaining "why don't the numbers add up?" (Does anybody have the home phone numbers for Census Bureau management?)
Here's a real life example using the CTPP Part 2 data. Let's say my boss asks a simple question: "How many transit commuters are at work in the Bay Area?" Using the Part 2 data, I am able to provide my boss 15 different answers!
The short answer is "320 thousand."
The long answer:
In Table 2-2 (Means18) there are five categories of "transit" that need to be summed to derive "total transit." In Table 2-12 (Means11) there are three categories of "transit" that need to be summed; and in Table 2-27 (Means8) there are two categories of transit that need to be summed to get total transit. (There are no "Means5" tables in CTPP2 where "transit" is one, and only one category.)
And there are multiple summary levels where one can derive a regional total count of transit commuters, including TAZ, block group, tract, county and the "MPO Summary Level". (Also, the county-place-remainder, the place-remainder-tract, and MSA/CMSA summary levels can be used to extract more "different answers.")
So, the following table illustrates the range of "regional transit commuters" using the three available means-of-transportation tables, and five of the different summary levels available in CTPP:
                  Table 2-2          Table 2-12         Table 2-27
    N  SUMLEV     (Transit=5 cats)   (Transit=3 cats)   (Transit=2 cats)
4,031  TAZ        319,435            319,553            319,600
4,384  Blk Grp    319,433            319,521            319,541
1,403  Tract      319,717            319,780            319,836
    9  County     320,116            320,129            320,125
    1  MPO        320,125            320,120            320,120
What this tells us is that the number of "regional transit commuters" working in the Bay Area is somewhere between 320,118 and 320,122, and it's rounded to 320,120. All of the other numbers are subject to a modest degree of rounding error.
AND THERE IS A PATTERN!!! There is data "leakage" the more one aggregates from lower levels of geography, and from a greater number of subcategories (e.g., aggregating from the five transit sub-groups versus the two transit sub-groups). This data leakage is hardly statistically significant. It is, however, annoying.
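The leakage can be tabulated as each cell's difference from the regional figure. A minimal sketch follows: the counts are copied from the table, 320,120 is the rounded MPO-level value, and the names `MPO_CONTROL`, `transit_by_sumlev` and `leakage` are made up for illustration.

```python
MPO_CONTROL = 320_120  # rounded MPO-level count of regional transit commuters

# Columns: Table 2-2 (5 cats), Table 2-12 (3 cats), Table 2-27 (2 cats).
transit_by_sumlev = {
    "TAZ": (319_435, 319_553, 319_600),
    "Blk Grp": (319_433, 319_521, 319_541),
    "Tract": (319_717, 319_780, 319_836),
    "County": (320_116, 320_129, 320_125),
    "MPO": (320_125, 320_120, 320_120),
}


def leakage(table, control):
    """Return, for each summary level, the difference of each cell from control."""
    return {level: tuple(v - control for v in values)
            for level, values in table.items()}


leak = leakage(transit_by_sumlev, MPO_CONTROL)
for level, diffs in leak.items():
    print(f"{level:<8}", "  ".join(f"{d:+5d}" for d in diffs))

# Largest absolute difference, as a percentage of the control total.
worst_pct = 100.0 * max(abs(d) for diffs in leak.values() for d in diffs) / MPO_CONTROL
print(f"worst case: {worst_pct:.2f}% of the control total")
```

Within the TAZ, block group and tract rows the shortfall shrinks from left to right, that is, as fewer transit categories are summed.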
My recommendation to users of CTPP data (Part 1 and Part 2):
1. Obtain your "regional control totals" or "state control totals" from the most geographically aggregate summary levels, e.g., SUMLEV=040 for states, and SUMLEV=930 for MPOs.
2. Avoid aggregating (summing together) your geographies whenever and wherever possible.
3. Avoid aggregating categories (e.g., detailed household income versus grouped household income; means of transportation) whenever and wherever possible. For example, to get the least affected count of 3-plus carpools, use tables based on Means of Transportation (8 categories.)
4. Sum as few categories as possible to derive aggregated measures such as "total transit." For "total transit" use CTPP Part 2, Table 27, where you are only summing bus/trolleybus to streetcar/subway/railroad/ferry.
5. Adjust (de-round, un-round) as you see fit. Use SF3 or PUMS to provide control totals to adjust the CTPP Part 1 data.
6. Develop a sense of humor. As I see it, this data rounding is a real joke. Don't take these data issues too seriously. And it's kind of funny that the numbers don't add up. Or, as they say: "close enough for government work."
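One simple way to do the adjustment in item 5 is proportional scaling to a control total. This is only a sketch of that one approach, not a recommended or official procedure; the zone counts, the control total of 510 and the function name `scale_to_control` are all hypothetical.

```python
def scale_to_control(values, control_total):
    """Proportionally scale values so that they sum to control_total."""
    current = sum(values)
    if current == 0:
        raise ValueError("cannot scale a series that sums to zero")
    factor = control_total / current
    return [v * factor for v in values]


zones = [120, 45, 310, 25]   # hypothetical rounded zone counts (sum = 500)
control = 510                # hypothetical control total from a higher summary level

adjusted = scale_to_control(zones, control)
print([round(v, 1) for v in adjusted], "sum =", round(sum(adjusted), 1))
```

The scaled values are no longer whole numbers, so any rounding for presentation would come after this step.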
cheers and good luck,
Chuck Purvis, MTC
Question: Using the CTPP, 10 + 10 + 10 + 10 = ?
f) Any of the Above
>>> ed christopher <edc(a)berwyned.com> 02/12/04 12:01PM >>>
The rounding within the CTPP data can play heck with doing any data
analysis. In the Chicago Central Area there are 155 individual TAZs.
If you take a simple table from Part 2, say mode to work by sex, some
interesting things happen. If you sum the total workers using the
"total" field you get 631,999. This becomes an important number because
people like to know the total. However, when you sum all the modes by
zone you get 631,883. This is not a big deal except if you want to show
drive alone, carpool, transit and other with their modal share
percents. In this region, some of us like to see the actual numbers
along with the percents. Logic would say to use the 631,883 when
calculating the percentages but then that means the sum of the totals
(which we know to be the better number because row rounding was applied
to the tables) 631,999 gets tossed aside. One could get creative and
distribute the 116 workers in some weighted fashion which would not
likely affect any percentages but then the next guy who comes along
using the CTPP data and software would get different numbers and we are
back splitting hairs over who got what number from where.
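To see how little the choice of denominator matters for the shares, here is a minimal sketch. The two totals are the ones quoted above; the mode counts are hypothetical, chosen only so that they add up to 631,883, and the names are illustrative.

```python
TOTAL_FIELD_SUM = 631_999  # sum of the "total" field over the 155 TAZs
MODE_SUM = 631_883         # sum of the individual mode cells over the same TAZs

# Hypothetical mode counts, invented so that they sum to MODE_SUM.
modes = {
    "drive alone": 250_000,
    "carpool": 60_000,
    "transit": 290_000,
    "other": 31_883,
}


def shares(counts, denominator):
    """Return each mode's percentage share of the given denominator."""
    return {mode: 100.0 * n / denominator for mode, n in counts.items()}


by_mode_sum = shares(modes, MODE_SUM)
by_total_field = shares(modes, TOTAL_FIELD_SUM)

residual = TOTAL_FIELD_SUM - MODE_SUM
max_gap = max(abs(by_mode_sum[m] - by_total_field[m]) for m in modes)

print("unallocated workers:", residual)
print(f"largest change in any share: {max_gap:.4f} percentage points")
```

With these made-up counts, switching denominators moves no share by as much as a hundredth of a percentage point, but only the shares computed against 631,883 add up to exactly 100.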
Are others finding the issue of rounded numbers a bit frustrating,
especially when it comes to aggregating TAZs?
I suppose one way to deal with this would be to simply round everything
to the nearest 100, even the 1980 and 1990 data, as it appears Chuck
Purvis, MTC, has done with his Commuting to Downtown trend analysis.
As of this morning, CTPP 2000 Part 2 is all in the mail.
This week we mailed:
For MPOs and State DOTs: In about a week, everybody should have their
package. If you do not receive anything by then, please let me know.
For those of you receiving your CTPP Part 2 packages, we now have the
handout and background information that came with your data posted at
the direct link to the information is
Federal Highway Administration
19900 Governors Drive
Olympia Fields, Illinois 60461
708-283-3534 (V) 708-574-8131 (cell)