TO: CTPP-News
FR: Chuck Purvis
I've taken the liberty of snipping a set of state data center listserv discussions
related to the census Demographic Profiles. These e-mail discussions are from the past
week, with most recent discussions first.
Basically, the weighted and expanded "long form" profile data show discrepancies
with the "short form" profile/PL94-171/SF-1 datasets, especially for VERY small
places with populations under 2,500.
NOTE that for the upcoming CTPP tabulations the minimum PLACE-LEVEL population threshold
is 2,500 persons....(Correct me if I'm wrong.)
Cheers, Chuck Purvis, MTC
****************************************************************************************
Your point about the 2,500 person cutoff for places in 1990 is a good one.
Those of us with many small towns noticed similar differences in 1990. For
example, when I looked at Nebraska towns with less than 2,500 persons in
1990, the mean absolute percent error comparing STF3 to STF1 was 5.7%. In
2000, for towns less than 2,500, it was 6.7%. However, if I threw out 2 towns with
populations of 10 and 11, the MAPE became comparable at 5.9%.
The mean absolute deviation for 1990 was about 10 persons, and for 2000 it
was about 12 persons.
Based on this quick and dirty analysis, it looks like the estimation
problem may have worsened somewhat between 1990 and 2000, but it is not new to the 2000
census.
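
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of the two measures mentioned above (mean absolute percent error and mean absolute deviation). It assumes the 100% count is the denominator for the percent error; the function names and the town figures are hypothetical and are not the Nebraska data.

```python
def mape(full_counts, sample_estimates):
    """Mean absolute percent error of sample estimates vs. 100% counts."""
    errors = [abs(s - f) / f * 100 for f, s in zip(full_counts, sample_estimates)]
    return sum(errors) / len(errors)


def mean_abs_dev(full_counts, sample_estimates):
    """Mean absolute deviation, in persons."""
    devs = [abs(s - f) for f, s in zip(full_counts, sample_estimates)]
    return sum(devs) / len(devs)


# Hypothetical towns as (100% count, sample estimate); NOT real census figures.
towns = [(10, 16), (11, 5), (400, 412), (1200, 1180), (2300, 2315)]
full = [f for f, _ in towns]
sample = [s for _, s in towns]

# Same measures after dropping the two very small towns.
larger = [(f, s) for f, s in towns if f > 11]
full_larger = [f for f, _ in larger]
sample_larger = [s for _, s in larger]

print("MAPE, all towns:      %.1f%%" % mape(full, sample))
print("MAPE, tiny towns out: %.1f%%" % mape(full_larger, sample_larger))
print("MAD, all towns:       %.1f persons" % mean_abs_dev(full, sample))
```

A difference of only a few persons is a very large percentage of a town of 10 or 11, which is why a couple of tiny places can move the MAPE noticeably while barely changing the mean absolute deviation.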
Jerry Deichert
Center for Public Affairs Research
University of Nebraska at Omaha
6001 Dodge Street
Omaha, NE 68182
****************************************************************************************
I wonder if this is a result of the Bureau's use of Counties as the primary
sampling unit to determine the weights for population and housing counts on
the sample data. In 1990, they used areas (counties, MCDs, places, and
census tracts) over a relatively small population threshold (I think 2,500).
Leonard M. Gaines, Ph.D.
Research Specialist
Empire State Development
e-mail: lgaines(a)empire.state.ny.us
Empire State Development & NY State Data Center Web Sites:
http://www.empire.state.ny.us
****************************************************************************************
In New Jersey, most discrepancies between SF1 and SF3 were found in CDPs. The differences
between the 100% and sample population counts were as high as 38.1% in Diamond Beach CDP
(218 vs. 135) and 31.4% in Vista Center CDP (541 vs. 711).
Other than the CDPs, only 7 (out of 566) municipalities had 5% or more
differences in population or housing unit counts. Pine Valley Borough had
the largest discrepancies (20% in population, 66.7% in housing units). All
except one are tiny municipalities with less than 600 residents.
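
As a check on the arithmetic, the sketch below recomputes the two CDP percentages quoted above. It assumes the first figure in each pair is the 100% count and that the difference is expressed as a percentage of that count; the function name is hypothetical.

```python
def pct_diff(full_count, sample_estimate):
    """Absolute difference as a percentage of the 100% count (assumed base)."""
    return abs(sample_estimate - full_count) / full_count * 100


# (100% count, sample estimate) as quoted in the message above.
cases = {
    "Diamond Beach CDP": (218, 135),
    "Vista Center CDP": (541, 711),
}

for name, (full, sample) in cases.items():
    print("%s: %.1f%%" % (name, pct_diff(full, sample)))
# Diamond Beach CDP: 38.1%
# Vista Center CDP: 31.4%
```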
Sen-Yuan Wu
New Jersey Department of Labor
*************************************************************************************
Ken Darga has also documented DP discrepancies in Michigan. The Bureau is now aware of
these problems and is looking into the cause.
Linda [Gage, California State Data Center]
***************************************************************************************
John -
Thank you for the work on this. I am going to forward it to the FSCPE listserv as well. It
is interesting that an even smaller community down the road looks numerically better.
If anyone wants to see the article that came out on Searchlight,
and a lesson on dealing with the press, check
http://www.lvrj.com/lvrj_home/2002/Jul-29-Mon-2002/news/19282077.html
[from Jeff Hardcastle, University of Nevada, Reno]
*************************************************************************************
There appear to be some problems with this. We ran a test with our DP datasets, comparing
the 100% and sample counts for all places. The summary report can be viewed at
http://mcdc2.missouri.edu/pub/data/sf3prof/check_totpops.pdf . The biggest problem, in
terms of pct difference in the counts, is definitely in the very small places. There are
593 places in the country where the difference was 25% or more and 566 of these were for
places with 500 people or less.
The report also includes a listing of these 593 places, sorted by state and descending
Pct Difference. The winner of the worst sample estimate award is Blacksville CDP, Ga.
They had a 100% count of 4 people, but the sample estimate was 52.
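
A check of this kind can be sketched roughly as follows. This is only an illustration, not the program behind the report: it assumes the percent difference is taken against the 100% count, and the function name and the third row are hypothetical. The Blacksville and Searchlight figures are the ones quoted in this digest.

```python
def flag_places(rows, min_pct=25.0):
    """Return places whose sample estimate differs from the 100% count
    by at least min_pct percent, sorted by descending percent difference."""
    flagged = []
    for name, full, sample in rows:
        if full == 0:
            continue  # percent difference is undefined for a zero count
        pct = abs(sample - full) / full * 100
        if pct >= min_pct:
            flagged.append((name, full, sample, pct))
    flagged.sort(key=lambda row: row[3], reverse=True)
    return flagged


# (place, 100% count, sample estimate)
places = [
    ("Searchlight, NV", 576, 768),
    ("Blacksville CDP, Ga.", 4, 52),
    ("Example City (hypothetical)", 12000, 12150),
]

flagged = flag_places(places)
small = [row for row in flagged if row[1] <= 500]

for name, full, sample, pct in flagged:
    print("%-28s %6d %6d %8.1f%%" % (name, full, sample, pct))
print("%d flagged, %d with 500 people or less" % (len(flagged), len(small)))
```

With a base of 4 people, an estimate of 52 works out to a 1,200% difference, which shows how quickly the percentages grow for the smallest places.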
John Blodgett
OSEDA - Office of Social & Economic Data Analysis
U. of Missouri Outreach and Extension
blodgettj(a)umsystem.edu
URL: http://oseda.missouri.edu/jgb/
*************************************************************************************
I am not sure if this has happened to other states for the Demographic
Profiles that include SF1 and SF3 data, but in Nevada's case there are
serious problems that suggest that the whole set of profiles needs to be
reviewed for errors. These errors appear to be more than standard
sampling and response errors. During a quick review, counties look
better than places; however, it appears that there may be geocoding
errors in the sample data. There also appear to be differences in
what the sample data are weighted against.
The place that brought this to my attention was Searchlight NV where the
DP-1 pop is 576 and the housing unit count is 444. On DP-2 through 4
the pop is 768 and the unit count is 595, differences of 33% and 34%,
respectively. The way I tumbled to this was that a reporter had seen
that Searchlight had no native Nevadans living there. He went to
Searchlight and interviewed people and found that most of them, if not
all, were natives. (Searchlight is an old mining town south of Las Vegas,
in the middle of very open country.)
Is this kind of error being found elsewhere?
Jeff Hardcastle
(775) 784-6353 Phone
(775) 784-4337 Fax
jhardcas(a)unr.edu e-mail
"shifts happen"
http://publicconversations.org/
*************************************************************************************