Some covariance terms are positive and some are negative in most cases. How the number of
variables affects the estimated MOE for the sum depends on how these positive and negative
covariance terms add up.
-----Original Message-----
From: ctpp-news-bounces(a)chrispy.net [mailto:ctpp-news-bounces@chrispy.net] On Behalf Of
Steven Farber
Sent: Thursday, March 14, 2013 3:13 PM
To: ctpp-news(a)chrispy.net
Subject: Re: [CTPP] Working with County flow data
Ed, Slide 39 (and a few preceding) lay the issue out on the table. It does indeed have to
do with covariance. The true equation for the SE of a sum of random errors must include
their covariances in the sum. I suppose when the number of variables being added together
is small, the omission of covariances does not lead to a large difference in MOEs. But the
estimation of MOEs gets worse and worse, the more covariances that you leave out. Since
covariances are pairwise, the number being left out grows quadratically with respect to
the number of variables being added together.
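A minimal sketch of the point about covariances, assuming the published MOEs are at the 90 percent level (z = 1.645). The MOE values and covariance values below are hypothetical and only illustrate the arithmetic; real covariances are not published alongside the estimates. The names moe_of_sum and pair_count are made up for this example.

```python
import math

Z90 = 1.645  # ACS MOEs are published at the 90 percent confidence level


def pair_count(n):
    """Number of pairwise covariance terms among n summands: n(n-1)/2."""
    return n * (n - 1) // 2


def moe_of_sum(moes, covariances=None):
    """MOE of a sum of estimates.

    moes:        MOEs of the individual estimates.
    covariances: optional dict mapping a pair (i, j), i < j, to the covariance
                 of estimates i and j. Pairs left out are treated as zero,
                 which is what the usual approximation does for every pair.
    """
    variance = sum((m / Z90) ** 2 for m in moes)
    for cov in (covariances or {}).values():
        variance += 2.0 * cov  # each pair enters the variance of a sum twice
    if variance < 0:
        raise ValueError("covariances imply a negative variance")
    return Z90 * math.sqrt(variance)


# Hypothetical MOEs and covariances, for illustration only.
moes = [100.0, 120.0, 80.0]
no_cov = moe_of_sum(moes)
with_cov = moe_of_sum(moes, {(0, 1): -1000.0, (0, 2): 400.0, (1, 2): -300.0})

print("MOE ignoring covariances:", round(no_cov, 1))
print("MOE with covariances:    ", round(with_cov, 1))
for n in (3, 5, 17):
    print(n, "summands leave out", pair_count(n), "covariance terms")
```

Whether the approximation overstates or understates the MOE depends on the signs and sizes of the omitted terms, so the sketch should not be read as saying the error always goes one way.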
Nancy's email below:
We were told to limit our aggregations to four items by Mark Asiala from the ACS staff at
our Annual California State Data Center meeting. It is mentioned in his presentation
slides online at
http://www.dof.ca.gov/research/demographic/state_census_data_center/meeting…
Slide 39
There is other information in this presentation about the new sample frame in the 2011
survey which may be interesting.
Nancy Gemignani
California State Census Data Center
Demographic Research Unit
(916) 327-0103 ext 2550
Steven Farber, Ph.D
Assistant Professor
Department of Geography
University of Utah
http://stevenfarber.wordpress.com
-----Original Message-----
From: ctpp-news-bounces(a)chrispy.net [mailto:ctpp-news-bounces@chrispy.net] On Behalf Of Ed
Christopher
Sent: March-14-13 12:57 PM
To: ctpp-news(a)chrispy.net
Subject: Re: [CTPP] Working with County flow data
Warning! This may ramble so if you do not care about the issue delete.
Steve I am looking for specific references to the "limit of 3". I know I have
heard this many times and in fact tested it myself. Using data from the Missouri State
Data Center I got Tract data for the modes people use to go to work for my neighborhood
Tract. With the Missouri data they had published the total commuters calculated with a MOE
along with the total workers. I then went and pulled the 4 block groups for my
neighborhood from the census website. At the time the Missouri Data Center did not have
Block Group data published. I do not know if they have them now, I did not check. While I
could get the breakdown of the modes for the BGs, the table did not have the total
commuters as a subtotal with a corresponding MOE. I figured I could just calculate my own
adding up the 5 modes and do the calculation. Before I went off to do this the scientist
in me took over and I tested the formula on the tract data just to see if I could
replicate the published MOE for the total number of commuters. I could not do it. Fortunately, Liang Long came to my rescue and
suggested that I just take the Total number of workers and subtract those who work at home
(both of which have MOEs) and try that. It worked! I could replicate the published MOE.
What this did was prove that as you add more variables to the mix the formula for calculating
the MOE breaks down.
For what I was doing I was able to find a way of only working with two variables but many
times you can not.
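The two routes described above can be sketched as follows, assuming the standard approximation in which the MOE of a sum or a difference is the square root of the sum of the squared MOEs. All estimates and MOEs here are hypothetical stand-ins, not the Missouri tract values, and the function name combine is made up for the example.

```python
import math


def combine(parts):
    """Combine (estimate, moe, sign) triples into a derived estimate and MOE.

    sign is +1 to add an estimate and -1 to subtract it. The MOE uses the
    approximation sqrt(sum of squared MOEs), which ignores covariances and
    is the same for sums and differences.
    """
    estimate = sum(sign * est for est, _moe, sign in parts)
    moe = math.sqrt(sum(m * m for _est, m, _sign in parts))
    return estimate, moe


# Route A: add up five commute modes (hypothetical numbers).
modes = [(1500, 60, +1), (200, 25, +1), (150, 20, +1), (100, 15, +1), (50, 10, +1)]
commuters_a, moe_a = combine(modes)

# Route B: total workers minus worked at home (hypothetical numbers).
commuters_b, moe_b = combine([(2100, 70, +1), (100, 18, -1)])

print("five-mode sum:         ", commuters_a, "+/-", round(moe_a, 1))
print("total minus at-home:   ", commuters_b, "+/-", round(moe_b, 1))
```

Both routes give the same estimate but generally different MOEs, because each route leaves out a different set of covariance terms. Route B leaves out one; Route A leaves out ten.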
When I presented this at a transportation census conference in October of 2011 several
users in the "power users session" confirmed that they had heard that 3 was the
most variables you wanted to use at a time. I did find this on the census site that says
"limit the number of variables"
http://www.census.gov/acs/www/Downloads/data_documentation/Statistical_Test…
A few days ago I talked with Elaine Murakami about this and she had the perfect rule of
thumb for me. Since the whole MOE thing is just an approximation anyway "just take
the largest MOE in the string of numbers you are aggregating and use that". If you
think about it, this does make some mathematical and more importantly intuitive sense. I
wish we could get some statisticians to help out here. We need easy, quick to use
methods.
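For comparison, here is a small sketch of the "largest MOE" rule of thumb next to the root-sum-of-squares approximation. The MOE values are hypothetical and the function names are made up for the example. One thing that is certain from the arithmetic: for n summands the root-sum-of-squares result always falls between the largest MOE and sqrt(n) times the largest MOE, so the rule of thumb is the low end of that range.

```python
import math


def largest_moe_rule(moes):
    """Rule of thumb: use the largest MOE among the summands."""
    return max(moes)


def root_sum_of_squares(moes):
    """Usual approximation: sqrt of the sum of squared MOEs (no covariances)."""
    return math.sqrt(sum(m * m for m in moes))


moes = [30.0, 40.0, 120.0]  # hypothetical MOEs
rule = largest_moe_rule(moes)
rss = root_sum_of_squares(moes)
upper = math.sqrt(len(moes)) * rule

print("largest-MOE rule:   ", rule)
print("root sum of squares:", rss)
print("upper bound:        ", round(upper, 1))
```

The two results are close when one MOE dominates the others, as in these made-up numbers, and drift apart when the MOEs are similar in size.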
Steven Farber wrote:
I think I jumped the gun before when stating concerns over exploding MOEs.
Going back to the New York State Data Center document, you'll notice that the MOE has
increased in absolute terms when summing over areas, but dropped in relative terms in
comparison to the sum.
So MOE has increased but the Coefficient of Variation has dropped. In other words, our
aggregated estimate is more precise than each of the smaller area estimates.
http://www.census.gov/acs/www/Downloads/handbooks/ACSResearch.pdf - Appendix 3 contains
all the calculations required.
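A rough illustration of the absolute-versus-relative point, assuming 90 percent MOEs (z = 1.645) and the approximation that ignores covariances. The three area estimates and MOEs are hypothetical, and coefficient_of_variation is a name made up for the example.

```python
import math

Z90 = 1.645  # ACS MOEs are published at the 90 percent confidence level


def coefficient_of_variation(estimate, moe):
    """CV = standard error / estimate, where standard error = MOE / 1.645."""
    return (moe / Z90) / estimate


# Hypothetical (estimate, MOE) pairs for three small areas.
areas = [(1000.0, 200.0), (1200.0, 220.0), (800.0, 180.0)]

total = sum(est for est, _moe in areas)
total_moe = math.sqrt(sum(moe * moe for _est, moe in areas))

area_cvs = [coefficient_of_variation(est, moe) for est, moe in areas]
total_cv = coefficient_of_variation(total, total_moe)

for (est, moe), cv in zip(areas, area_cvs):
    print("area:", est, "+/-", moe, "CV =", round(cv, 3))
print("sum: ", total, "+/-", round(total_moe, 1), "CV =", round(total_cv, 3))
```

With these numbers the MOE of the sum is larger than any single area's MOE, while its CV is smaller than any single area's CV.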
Ed, do you recall where you saw that this type of calculation should be limited to 3
summands at a time?
Steven Farber, Ph.D
Assistant Professor
Department of Geography
University of Utah
http://stevenfarber.wordpress.com
-----Original Message-----
From: ctpp-news-bounces(a)chrispy.net
[mailto:ctpp-news-bounces@chrispy.net] On Behalf Of liang.long(a)dot.gov
Sent: March-12-13 10:18 AM
To: ctpp-news(a)chrispy.net
Subject: Re: [CTPP] Working with County flow data
I can see why Census doesn't recommend doing more than three variables at a time. When
you add 17 counties together, you get a much bigger area with more households sampled. In
theory, you should get smaller MOEs compared to each individual county. But if you derive
MOEs from those 17 counties, you will get much bigger MOEs, which is contradictory to
the theory.
________________________________________
From: ctpp-news-bounces(a)chrispy.net [ctpp-news-bounces(a)chrispy.net] on
behalf of Ed Christopher [edc(a)berwyned.com]
Sent: Tuesday, March 12, 2013 11:15 AM
To: ctpp-news(a)chrispy.net
Subject: Re: [CTPP] Working with County flow data
Thanks--I know the spreadsheet allows you to recalculate MOEs for more than three
variables but I remember doing more than 3 a while back and I was getting some wild MOEs.
When I dug into it I found something in the Census compass reports that said not to do
more than three variables at a time. I was hoping that someone figured out a way around
this.
Ed C
On Mar 12, 2013, at 9:59 AM, "Hoctor Mulmat, Darlanne"
<Darlanne.Mulmat@sandag.org> wrote:
The New York State Data Center developed a Statistical Calculations Menu that includes an
option for computing the margin of error for the sum of three or more estimates. See
attached.
Darlanne Hoctor Mulmat
Applied Research Division - Criminal Justice/Public Policy San Diego
Association of Governments
619-699-7326
From: ctpp-news-bounces@chrispy.net [mailto:ctpp-news-bounces@chrispy.net] On Behalf Of
Ed.Christopher@dot.gov
Sent: Tuesday, March 12, 2013 6:57 AM
To: ctpp-news@chrispy.net
Subject: [CTPP] Working with County flow data
Has anyone come up with some easy ways for collapsing and grouping counties together
using last week's county flow data and recalculating new MOEs? I have so many counties
that I want to group together that I am looking for a quick way that can handle
"lots" of counties. Another issue I am struggling with is that we are always
told not to group more than three variables at a time or the formulas for calculating the
new MOE do not really work. This is particularly troublesome if I am trying to
group 17 counties together. What it comes down to is 9 different calculations given that
I can only group 3 counties at a time together. Anyone figure out any short cuts or ways
around this short of disregarding the MOEs altogether? Given all the clustering that I am
looking at using the "cheat" sheets I am used to, I will be recalculating MOEs
for weeks.
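A small sketch of the arithmetic behind grouping three at a time, assuming the usual approximation (square root of the sum of squared MOEs). Because squaring an intermediate result recovers the running sum of squares, chaining groups of three gives the same number as combining all 17 counties in one pass; the grouping changes the amount of work, not the answer or the covariance terms it leaves out. The county MOEs are hypothetical and the function names are made up for the example.

```python
import math


def approx_moe(moes):
    """Approximate MOE of a sum: sqrt of the sum of squared MOEs."""
    return math.sqrt(sum(m * m for m in moes))


def moe_in_groups(moes, size=3):
    """Collapse MOEs `size` at a time until one remains.

    Returns the final MOE and the number of separate calculations made.
    """
    if size < 2:
        raise ValueError("size must be at least 2")
    level = list(moes)
    steps = 0
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), size):
            chunk = level[i:i + size]
            if len(chunk) > 1:
                steps += 1
                next_level.append(approx_moe(chunk))
            else:
                next_level.append(chunk[0])  # leftover item carries forward
        level = next_level
    return level[0], steps


# Hypothetical MOEs for 17 counties.
county_moes = [float(10 + i) for i in range(17)]

grouped, steps = moe_in_groups(county_moes, size=3)
one_pass = approx_moe(county_moes)

print("three at a time:", round(grouped, 3), "in", steps, "calculations")
print("all at once:    ", round(one_pass, 3), "in 1 calculation")
```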
Ed Christopher
<StatisticalCalculationsMenu.xls>
_______________________________________________
ctpp-news mailing list
ctpp-news@ryoko.chrispy.net
http://ryoko.chrispy.net/mailman/listinfo/ctpp-news
--
Ed Christopher
708-283-3534 (V)
708-574-8131 (cell)
FHWA RC-TST-PLN
4749 Lincoln Mall Drive, Suite 600
Matteson, IL 60443