Hi Ken:
I read your excellent analysis of Census 2000 Part III tract-level,
county-level and state-level data sets. Your findings in the detailed
spreadsheet for 12 counties in Texas are not unique to Texas. According
to my evaluation of CTPP 2000 data for the Delaware Valley Region, the
errors and incompleteness in the census data stem mainly from rounding and
disclosure threshold rules. The 2000 rounding and disclosure threshold
rules resulted in 3 and 62 percent loss in the worker flow at the TAZ
level, respectively. As you know, Table 3-06, which shows TAZ-to-TAZ
worker flow by means of transportation to work, became totally useless
due to suppressed worker flows.
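To make the effect of a disclosure threshold concrete, here is a minimal sketch in Python. It assumes the rule simply drops any flow backed by fewer than a given number of unweighted sample records; the flows, zone names and helper functions are all hypothetical and are not taken from CTPP 2000.

```python
# Hypothetical TAZ-to-TAZ flows (illustration only, not CTPP data):
# (origin, destination) -> (unweighted sample records, weighted workers)
flows = {
    ("TAZ1", "TAZ2"): (5, 40),
    ("TAZ1", "TAZ3"): (2, 15),
    ("TAZ2", "TAZ3"): (1, 10),
    ("TAZ3", "TAZ1"): (4, 35),
}


def apply_threshold(flows, threshold=3):
    """Keep only flows with at least `threshold` unweighted records.

    Assumption: the threshold is applied per origin-destination pair.
    """
    return {od: v for od, v in flows.items() if v[0] >= threshold}


def percent_loss(flows, kept):
    """Percent of weighted workers lost when only `kept` flows remain."""
    total = sum(weighted for _, weighted in flows.values())
    remaining = sum(weighted for _, weighted in kept.values())
    return 100.0 * (total - remaining) / total


kept = apply_threshold(flows, threshold=3)
loss = percent_loss(flows, kept)
print(f"Flows kept: {len(kept)} of {len(flows)}; worker loss: {loss:.1f}%")
```

Because small zones produce many flows with only one or two sample records, the share of workers lost can be large even when each dropped flow is small.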
It is simply not possible to have a synthesized Part III dataset at the
TAZ level. We tried to develop a rational and convenient method to do
this but we could not and thus were unable to use TAZ-to-TAZ data by
means of transportation to work.
Relaxation of the disclosure rules would not solve the problem. If the
2000 disclosure threshold were decreased from 3 unweighted workers to 1,
the damage to the TAZ-to-TAZ worker flow would still be significant and
the data would be useless. Based on this evaluation, I recommended last
year not to use any disclosure threshold in the future. There was no
disclosure threshold in the 1990 census and no one complained because of
disclosure. Imputation, swapping and data rounding were used in the 1990
census to protect confidentiality and were sufficient to avoid
disclosure.
The "super TAZ" concept will not work in the northeast states, or in
Michigan, Wisconsin and Minnesota, because these states have many
small minor civil divisions, each of which consists of only one TAZ. Such small
municipalities are interested in their own data. Also, data from "super
TAZs" will not be useful for planning small transportation projects such
as subway stations, highway interchanges, park and ride lots and local
bus routes.
Finally, I should state that the margin of error in the ACS tabulation
will be larger than that found in Census 2000. The CB is saying now that
the 2005 ACS will provide data for areas of 65,000 population and larger
because the sample size was increased from that of 2004. Yet the 2005
ACS sample was not large enough to produce a table for all means of
transportation to work and a table for workers by place of work and by
industrial sector. This information was suppressed completely for
Gloucester County, NJ (277,000 population), not 65,000 as promised by
the CB. I know that Ed Christopher, Elaine Murakami, Jonette Kreideweis,
Dave Clawson, Alan Pisarski, Nanda Srinivansan and others are aware of
these CTPP issues. Hopefully, they will be able to resolve these
problems and produce accurate CTPP data in the future.
Thabet Zakaria
Deputy Director, Technical Services
Delaware Valley Regional Planning Commission
Philadelphia, PA 19106
Phone: 215-238-2885
Email: tzakaria(a)dvrpc.org
Fax: 215-592-9125
-----Original Message-----
From: ctpp-news-bounces(a)chrispy.net
[mailto:ctpp-news-bounces@chrispy.net] On Behalf Of Ken Cervenka
Sent: Tuesday, October 03, 2006 2:17 PM
To: ctpp-news maillist
Subject: RE: [CTPP] CTPP Discussion Issues
Ed,
Thanks for the additional info. The issues raised in the draft email I
was working on relate to what you just said: 1) the idea of larger TAZs
to limit disclosure problems; and 2) the idea of data synthesis to still
get very small-area data. So my apologies for redundancy in addressing
the same thing below, but in different words.
*********************************************************************
Ed said at the end of his email to post questions/comments to the
listserv, so...
I do indeed hope AASHTO will approve the new CTPP Pooled Fund
initiative. In spite of the difficulties "us MPOs" and other
organizations have with uncertainties on the accuracy of Census products
(and the ACS and CTPP 2000 comparisons people have been making are
pointing out some important issues), and the impacts of disclosure rules
applied to journey-to-work tables, the "long form" questions are
nevertheless still important to the transportation planning process. As
others before me have said (I recall Alan Pisarski giving a speech or
two or three on this topic), the word "data" is not particularly
glamorous to people outside the profession (and oftentimes rather boring
to those within the profession), but yet represents the foundation to an
informed (and democratic) decision-making process.
So with those sincere accolades on the underlying importance of CTPP
data--and hopes for AASHTO pooled funding--stated, now on to the
technical issues.
The current disclosure rules have been a major hindrance to "good"
small-area transportation planning in which we want more than simply
place-of-residence and place-of-work information. Now, I understand why
the Census Bureau must always retain procedures that maintain
confidentiality of individuals, but I wonder if there isn't more to be
checked and confirmed, just to be sure that the "rules" in place are not
being too restrictive. The impacts of the restrictions have been
discussed many times before, but can be easily seen by working with the
Census 2000 Part III tract-level, county-level, and state-level datasets
available from:
http://www.transtats.bts.gov
(note: when you are at this location, just type in CTPP 2000 Part III
in the "search this site" box to get to the appropriate downloading
page).
The attached spreadsheet shows some statistics for 10 different states,
in which I summarized the Tract-to-Tract data in the State, for selected
variables. Since not everyone's email is set up to receive attachments,
here is the gist of what's shown on the spreadsheet:
-- Question T301C1 represents total workers, which for these 10 states
is 56,157,132 workers.
-- Question T302C1_1 represents all workers residing in households that
are definable by their "vehicles available in household" and seven
"Means of Transportation" aggregated categories; the 10-state total is
55,643,556 workers, which is equal to 99.1% of the total workers (I
presume the very minor loss in workers is due to the "residing in
households" definition).
-- Question T305C1 represents all workers residing in households that
are definable by a household income category; the 10-state total is
65.7% of the total workers.
-- Question T306C1 represents all workers residing in households that
are definable by one of the 17 "Means of Transportation" detailed
categories; the 10-state total is 66.4% of the total workers.
Of equal significance are the variations of "completeness" for each of
these 10 selected states, e.g., for Question T306C1 there is a range of
56.0% for New York to 79.8% for Oregon. When confronted with these
"missing data" issues, I suspect some planners simply factor up the
available trip table so it appears to represent the full universe--but
even a cursory examination of county-level Part III trip tables versus
"Part III tract-level aggregations to county" trip tables will show that
the missing data is not equally represented across all categories. For
example, consider the 17 "Means of Transportation" data for Dallas
County: the tract-level aggregations represent 60% of the reported
county-level total for drive alone trips; 80% of the total for walk
trips; and 100% of the total for work-at-home trips. So if someone used
a simple factoring process across all modes in the tract-level datasets
to reach the full universe, the end result is that the actual number of
drive alone trips would be significantly under-represented and the
number of walk trips and work-at-home trips would be significantly
over-represented in the factored tract-to-tract data. [Note: if anyone
is interested, I have a detailed spreadsheet that shows these issues for
12 counties in Texas--but it is a rather messy spreadsheet that has not
been thoroughly checked, so I don't want to forward to people unless
they ask me via private email, and promise not to forward to a
listserv].
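The factoring problem above can be sketched in a few lines of Python. The coverage shares (60%, 80%, 100%) are the Dallas County figures quoted above; the county-level totals are hypothetical round numbers chosen only for illustration, so the size of the resulting bias should not be read as a real result.

```python
# Hypothetical county-level totals by mode (illustration only).
county_total = {"drive_alone": 1000.0, "walk": 100.0, "work_at_home": 50.0}

# Share of the county total recovered by aggregating tract-level records
# (the 60% / 80% / 100% figures quoted in the text for Dallas County).
coverage = {"drive_alone": 0.60, "walk": 0.80, "work_at_home": 1.00}

# What the tract-to-tract data add up to before any factoring.
tract_sum = {m: county_total[m] * coverage[m] for m in county_total}

# Simple factoring: one factor for all modes, so the grand total matches.
uniform_factor = sum(county_total.values()) / sum(tract_sum.values())
uniform = {m: tract_sum[m] * uniform_factor for m in tract_sum}

# Mode-specific factoring: one factor per mode, so each mode total matches.
per_mode = {m: tract_sum[m] * (county_total[m] / tract_sum[m]) for m in tract_sum}

for m in county_total:
    print(f"{m:13s} county={county_total[m]:8.1f} "
          f"uniform={uniform[m]:8.1f} per_mode={per_mode[m]:8.1f}")
```

With a single factor the grand total is right, but drive alone comes out below its county total and the walk and work-at-home totals come out above theirs. Factoring mode by mode fixes the mode totals, although it still says nothing about which origin-destination pairs the missing workers belong to.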
Since most TAZs (at least as defined by the MPOs) are generally a LOT
smaller than Census Tracts, the TAZ-to-TAZ trip tables from the CTPP
2000 effort wind up having even greater amounts of missing data for the
income-related and "detailed means of transportation" categories. Which
brings up two possibilities for future improvement (assuming there will
be Part III datasets in the future at anything other than a County or
PUMA level):
1. I wonder if there might be opportunities to someday have two sets of
TAZ-to-TAZ files: one dataset would have the same "missing records"
limitations we currently see (and agonize about), and the other dataset
would be based on a process that uses some of the held-back information
to "synthesize" the missing data to develop a fully-complete TAZ-to-TAZ
table. Now independent researchers can already use various "Iterative
Proportional Fitting" or similar imputation procedures to deal with
missing data issues, but I am concerned that their techniques will not
be as good as what the Census Bureau staff could do, since the CB staff
would have access to more of the underlying raw data that would (or at
least could/should) result in a "better" synthesis process. Now I can
understand the cost implications and potential confusions of two sets of
"official" TAZ-to-TAZ data, but this would give us end-user planners the
greatest number of choices for making decisions about future land use
and capacity changes. Plus, by making the "missing records" tables
available, this would still give agencies an opportunity to conduct (at
their own expense) their own imputation process that could then be
compared against the CB's synthesized tables.
2. If it is simply not possible to have a synthesized Part III dataset
at the TAZ level--and there is not much hope for any significant
relaxation of the disclosure rules--then maybe we need to pay great
attention to the "Super TAZ" concept, in which each TAZ is carefully
defined so that it is "just large enough" to reduce the more serious
confidentiality issues, while still being useful for planning. An
example would be a typical CBD: this is MUCH smaller than a County or
PUMA, but is often composed of many existing TAZs and Census Tracts. So
instead of having a CBD represented by 25 TAZs, maybe it could be
divided into just four or five TAZs--or even just one TAZ, if that's
what it takes to keep the understandable confidentiality issues from
over-taking our desires for a more democratic decision-making process
that is based on understanding the world we live in.
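For readers unfamiliar with the "Iterative Proportional Fitting" mentioned in item 1, here is a minimal two-dimensional sketch in Python. The seed table and the row and column targets are hypothetical, and the function is a generic textbook version, not the procedure used by the Census Bureau or by any particular agency.

```python
def ipf(seed, row_targets, col_targets, iterations=200, tol=1e-9):
    """Scale `seed` until its row and column sums match the targets.

    Assumptions: the targets have the same grand total, and every cell
    that should end up non-zero has a non-zero seed value (IPF cannot
    fill a cell whose seed is zero).
    """
    table = [row[:] for row in seed]  # work on a copy
    n_rows, n_cols = len(table), len(table[0])
    for _ in range(iterations):
        # Step 1: scale each row to its target total.
        for i in range(n_rows):
            s = sum(table[i])
            if s > 0:
                table[i] = [v * row_targets[i] / s for v in table[i]]
        # Step 2: scale each column to its target total.
        for j in range(n_cols):
            s = sum(table[i][j] for i in range(n_rows))
            if s > 0:
                for i in range(n_rows):
                    table[i][j] *= col_targets[j] / s
        # Columns now match exactly; stop once the rows match too.
        if all(abs(sum(table[i]) - row_targets[i]) < tol for i in range(n_rows)):
            break
    return table


# Hypothetical 3x3 zone-to-zone seed (e.g. from an older survey or a model).
seed = [[40.0, 5.0, 10.0],
        [5.0, 30.0, 5.0],
        [10.0, 10.0, 20.0]]
row_targets = [60.0, 50.0, 50.0]  # workers by zone of residence (hypothetical)
col_targets = [70.0, 50.0, 40.0]  # workers by zone of work (hypothetical)

fitted = ipf(seed, row_targets, col_targets)
for row in fitted:
    print([round(v, 1) for v in row])
```

The fitted table reproduces the published margins, but the pattern inside it comes entirely from the seed, which is why access to more of the underlying raw data would matter for the quality of any synthesis.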
Ken Cervenka
North Central Texas COG
(Dallas-Fort Worth MPO)
Kcervenka(a)nctcog.org
*************************************************
-----Original Message-----
From: ctpp-news-bounces(a)chrispy.net
[mailto:ctpp-news-bounces@chrispy.net] On Behalf Of ed christopher
Sent: Tuesday, October 03, 2006 12:35 PM
Cc: ctpp-news maillist
Subject: Re: [CTPP] CTPP Discussion Issues
Ken--the short answer is yes; Part 1, Part 2 and Part 3 (flow data). With
that said, there will be some caveats and of course lots of issues to be
discussed, debated, researched and decided upon. In hearing from users and
those active on the AASHTO SCOP Data Working Group, people have been very
clear that they want small-area TAZ-to-TAZ flow data. Under the proposed
pooled-fund, that would happen, but certain concessions would have to be made
to accommodate the Census Bureau's Disclosure Review Board concerns. One
very practical notion on the table is to have a larger geography (call it a
super TAZ for now) where we would have all our data with no suppression or
other disclosure rules. From that we would extend a more traditional data
package for the smallest zones (traditional TAZs) that would likely be
synthesized. Some spot research is going on in this vein and the proposed
pooled-fund calls for more, as well as an NCHRP 8-36 proposal that we are
hopeful will get funded. For now, before we get too far afield, I think it
is critical to see that the pooled-fund comes to fruition.
Ken Cervenka wrote:
Ed (or anybody)
Could you please confirm that when you talk about TAZ data, you are
referring to not simply Place-of-Residence (Part I) and Place-of-Work
(Part II) summaries at the TAZ level, but the possibility for TAZ-to-TAZ
(Part III) datasets? If there is still hope for future TAZ-to-TAZ (or
super-TAZ to super-TAZ) dataset deliveries, I have lots to say on the
subject, that I am willing to put out on this listserv.
Ken C.
-----Original Message-----
From: ctpp-news-bounces(a)chrispy.net
[mailto:ctpp-news-bounces@chrispy.net] On Behalf Of ed christopher
Sent: Monday, October 02, 2006 10:19 AM
To: ctpp-news maillist
Subject: [CTPP] CTPP Discussion Issues
During the past few months I have received several questions regarding
the need for small area, Traffic Analysis Zone (TAZ), data from the
American Community Survey (ACS) so I thought I would provide an
update.
GENERAL UPDATE ON CTPP
There are two groups that are working on census data needs for the
transportation community.
First there is the long-standing CTPP Working Group that meets monthly
and has been responsible for the content of the 1980, 1990 and 2000 data
packages. Although the precise membership of this working group has
changed over time, it has generally been made up of US DOT staff, Census
Bureau (CB) staff, Dave Clawson from AASHTO, and members of
Transportation Research Board (TRB) committees. The work of the CTPP
Working Group tends to focus on the highly technical aspects of the
data. Since 1997, this group has met at least once a month and has been
chaired by Elaine Murakami of FHWA.
Working in concert with the CTPP Working Group, the AASHTO Standing
Committee on Planning (SCOP) last August (2005) initiated a broader
based Committee called the SCOP Census Work Group. Jonette Kreideweis
of the Minnesota DOT chairs this group. Its main focus has been on
issues that transportation planners need to know to use the ACS and it
has been instrumental in recognizing that a "family" of new data
products will be needed. In June the Work Group proposed a pooled-fund
project that includes data products, research, training, and technical
support. The pooled-fund is currently before SCOP and I hope that it
will be approved by AASHTO in October. The pooled-fund builds upon the
experience gained from the 1990 and 2000 pooled-fund projects, and
details on it can be found on the TRB Subcommittee on Census Data
website at
http://trbcensus.com/SCOP/
TAZ DEFINITION for ACS
1. Will MPOs and State DOTs be asked to submit new TAZs?
Assuming that a new CTPP pooled-fund is approved by AASHTO, there will
be an opportunity to define new TAZs for CTPP data products. Questions
to be answered revolve around how many different TAZ systems there
should be, the cost of developing those systems and the mechanical
process for submitting them. To help define TAZs, discussions with the
CB's Geography Division are underway.
2. How will new TAZs be submitted to the Census Bureau (CB) and added
to TIGER?
The CB has a contract with M-cubed and its subcontractor Caliper
Corporation for software development to support the "Participant
Statistical Areas Program" (PSAP). The PSAP includes the tract and
block group definition process. The software being developed for this
program can be modified to accommodate TAZ, SuperTAZ or any other
geographic units that the transportation planning community would like
to have. As a result, it makes sense to have tract, block group, and
the TAZ definition efforts be a coordinated process. Of course, the
development of any TAZs is premised on, and would be paid for by, the
pooled-fund.
3. When will the Census Bureau (CB) use the new TAZ definitions to
tabulate ACS data for CTPP?
Since the CB PSAP to define tracts and block groups is focused on the
2010 regular census, tabulations of ACS data issued in 2010 and after
would follow the new geographies. Any earlier products would use
existing 2000 geography. For example, the pooled-fund calls for
small-area tabulations from ACS data using a required 5-year period --
2005, '06, '07, '08 and '09. The data would not be released until
2010-2011 and we have been assured that it would follow 2010 geography.
However, for our first 3-year product (2005, '06 and '07), we have to
use 2000 geographic units and meet the 20,000 population threshold per
zone.
4. What is the history of TAZ definition for CTPP?
For the 1980 and 1990 data tabulations TAZs were restricted to 6
characters and only one set of TAZs could be defined per region. The
data assigned to these TAZs represented an equivalency process where
MPOs were asked to let the CB know which blocks to assign to which TAZ.
Blocks could not be split and the TAZs really became an aggregation of
blocks and block groups.
In 2000, a major improvement was made where the TAZs were defined early
and placed in the CB TIGER file. With the TAZs in TIGER, the CB was then
able to put the actual data for the area the TAZ represented in the
predetermined TAZs. For the first time the "urban" TAZs became a unique
tabulation geography and not just an aggregation of Census geographical
units. While this new process was adopted for TAZs, the equivalency
process remained for those areas that created state-level TAZs. This was
due to size limitations of the TIGER record and other technical
processing issues.
5. What are some items to consider?
As we look toward the future and ACS-related transportation data
products, further TIGER improvements are on the horizon. Inside the CB,
the Geography Division is undergoing a major overhaul of its TIGER file
database. Conceptually, TIGER is moving to something akin to a
relational database and space limitations are a thing of the past. Not
only can we consider adding more characters to the TAZ field, we can
even talk about having different TAZ zone structures. Assuming the
AASHTO pooled-fund is approved, one of the first tasks will be to focus
on defining TAZs for inclusion into the new TIGER system. As a result,
we need to focus on what we want in terms of TAZs.
Other items that are nearing the horizon concern the detailed data tables
to be included in the various products. The pooled-fund is calling for
a one-, three- and five-year transportation data product. However, before
we work on the specific details of the product design, the pooled-fund
must first be approved and funded.
As Chair of the TRB Urban Data Committee, the sponsoring committee of
the Census Data Subcommittee, I hope to see this list serve used as a
forum to feed these discussions. Currently we have over 550 subscribers
to this list, most of whom are the active census data users in their MPOs,
states, consulting firms and universities.
Finally, if you have a question about any of this, please post it to the
list serve. We are at a critical time in terms of getting the
information out to our various users. So please share your comments,
questions and experiences.
--
Ed Christopher
Resource Center Planning Team
Federal Highway Administration
19900 Governors Drive
Olympia Fields, Illinois 60461
708-283-3534 (V) 708-574-8131 (cell)
708-283-3501 (F)