When you talk about the number of variables, do you mean combinations or
sequential tables?
A combination is, e.g., means of travel to work by sex, by race/Hispanic
origin. That's 3 variables.
Sequential is, e.g., looking first at means of travel, then secondly at
travel time, then sex of workers, then race/Hispanic origin of workers.
That's four variables but they are not being cross-tabbed with one
another.
I agree that three is a reasonable limit in a *combination*. The data get
stretched so thin that even 3 may be too many.
In sequential, you should be able to look at as many variables as needed.
Overall, it's still all about the number of cases. The more cases, the
lower the MOE and vice versa.
Patty Becker
On Thu, Mar 14, 2013 at 3:13 PM, Steven Farber
<Steven.Farber(a)geog.utah.edu> wrote:
Ed, Slide 39 (and a few preceding) lay the issue out on the table. It does
indeed have to do with covariance. The true equation for the SE of a sum of
random errors must include their covariances in the sum. I suppose when
the number of variables being added together is small, the omission of
covariances does not lead to a large difference in MOEs. But the estimation
of MOEs gets worse and worse, the more covariances that you leave out.
Since covariances are pairwise, the number being left out grows
quadratically with respect to the number of variables being added together
(k variables have k(k-1)/2 pairs).
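A rough sketch of the covariance point, in Python. The function names and the optional covariance table are assumptions made up for illustration; in practice the covariances are usually not known to the data user, which is why the common approximation drops them. The sketch also assumes the MOEs are at the 90 percent confidence level (z = 1.645).

```python
import math
from itertools import combinations

Z90 = 1.645  # assumed: MOEs published at the 90 percent confidence level


def n_covariance_terms(k):
    """Number of distinct pairwise covariances among k estimates."""
    return k * (k - 1) // 2


def moe_of_sum(moes, cov=None):
    """MOE of a sum of estimates.

    moes: published MOEs, one per estimate.
    cov:  hypothetical dict mapping index pairs (i, j), i < j, to the
          covariance of estimates i and j. Any pair left out is treated
          as zero, which is what the usual approximation does for all pairs.
    """
    cov = cov or {}
    # Convert each MOE to a standard error, then to a variance.
    variance = sum((m / Z90) ** 2 for m in moes)
    # Each pairwise covariance enters the variance of the sum twice.
    for i, j in combinations(range(len(moes)), 2):
        variance += 2 * cov.get((i, j), 0.0)
    return Z90 * math.sqrt(variance)


if __name__ == "__main__":
    for k in (2, 3, 5, 17):
        print(k, "estimates ->", n_covariance_terms(k), "covariances ignored")
```

With 3 estimates only 3 covariance terms are being ignored; with 17 it is 136.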
Nancy's email below:
We were told to limit our aggregations to four items by Mark Asiala from
the ACS staff at our Annual California State Data Center meeting. It is
mentioned in his presentation slides online at
http://www.dof.ca.gov/research/demographic/state_census_data_center/meeting…
Slide 39
There is other information in this presentation about the new sample frame
in the 2011 survey which may be interesting.
Nancy Gemignani
California State Census Data Center
Demographic Research Unit
(916) 327-0103 ext 2550
Steven Farber, Ph.D
Assistant Professor
Department of Geography
University of Utah
http://stevenfarber.wordpress.com
-----Original Message-----
From: ctpp-news-bounces(a)chrispy.net [mailto:ctpp-news-bounces@chrispy.net]
On Behalf Of Ed Christopher
Sent: March-14-13 12:57 PM
To: ctpp-news(a)chrispy.net
Subject: Re: [CTPP] Working with County flow data
Warning! This may ramble so if you do not care about the issue delete.
Steve, I am looking for specific references to the "limit of 3". I know I
have heard this many times and in fact tested it myself. Using data from
the Missouri State Data Center I got Tract data for the modes people use to
go to work for my neighborhood Tract. With the Missouri data they had
published the total commuters calculated with a MOE along with the total
workers. I then went and pulled the 4 block groups for my neighborhood from
the census website. At the time the Missouri Data Center did not have Block
Group data published. I do not know if they have them now, I did not
check. While I could get the breakdown of the modes for the BGs, the table
did not have the total commuters as a subtotal with a corresponding MOE. I
figured I could just calculate my own adding up the 5 modes and do the
calculation. Before I went off to do this the scientist in me took over
and I tested the formula on the tract data just to see if I could replicate
the published MOE for the total number of commuters. I could not do it.
Fortunately, Liang Long came to
my rescue and suggested that I just take the Total number of workers and
subtract those who work at home (both of which have MOEs) and try that. It
worked! I could replicate the published MOE. What this did was prove that
as you add more variables to the mix the formula for calculating the MOE breaks
down.
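The comparison described above can be sketched in Python as follows. All numbers are hypothetical, not the Missouri tract values, and the formula is the usual approximation for a sum or difference (square root of the sum of squared MOEs), which ignores covariance.

```python
import math


def moe_sum_or_diff(moes):
    """Approximate MOE of a sum or difference of estimates.

    Uses the square root of the sum of squared MOEs, which assumes the
    estimates are uncorrelated.
    """
    return math.sqrt(sum(m * m for m in moes))


# Hypothetical MOEs for one tract (illustrative values only).
mode_moes = [120, 45, 60, 25, 30]   # the five commute modes
total_workers_moe = 130             # total workers
work_at_home_moe = 35               # worked at home

# Route 1: add up five modes (five terms in the formula).
route_five = moe_sum_or_diff(mode_moes)
# Route 2: total workers minus worked at home (two terms).
route_two = moe_sum_or_diff([total_workers_moe, work_at_home_moe])

if __name__ == "__main__":
    print("five-term MOE:", round(route_five, 1))
    print("two-term MOE: ", round(route_two, 1))
```

The two routes describe the same total but will generally give different MOEs, because each route drops a different set of covariances.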
For what I was doing I was able to find a way of only working with two
variables but many times you cannot.
When I presented this at a transportation census conference in October of
2011 several users in the "power users session" confirmed that they had
heard that 3 was the most variables you wanted to use at a time. I did
find this on the census site that says "limit the number of variables"
http://www.census.gov/acs/www/Downloads/data_documentation/Statistical_Test…
A few days ago I talked with Elaine Murakami about this and she had the
perfect rule of thumb for me. Since the whole MOE thing is just an
approximation anyway "just take the largest MOE in the string of numbers
you are aggregating and use that". If you think about it, this does make
some mathematical and more importantly intuitive sense. I wish we could
get some statisticians to help out here. We need easy, quick to use methods.
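For anyone who wants to see how the "largest MOE" shortcut compares with the usual root-sum-of-squares approximation, here is a small Python sketch. The function names and the numbers are made up for illustration. One thing that can be said for certain is that the largest single MOE can never exceed the root-sum-of-squares value, so the shortcut always gives the smaller of the two.

```python
import math


def moe_root_sum_squares(moes):
    """Usual approximation: square root of the sum of squared MOEs."""
    return math.sqrt(sum(m * m for m in moes))


def moe_largest(moes):
    """Rule of thumb from this thread: use the largest MOE in the string."""
    return max(moes)


if __name__ == "__main__":
    moes = [120, 45, 60, 25, 30]  # hypothetical MOEs being aggregated
    rss = moe_root_sum_squares(moes)
    big = moe_largest(moes)
    print("root sum of squares:", round(rss, 1))
    print("largest MOE:        ", big)
    print("shortcut / formula: ", round(big / rss, 2))
```

Which of the two is closer to the published MOE depends on the covariances that neither one accounts for.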
Steven Farber wrote:
I think I jumped the gun before when stating concerns over exploding MOEs.
Going back to the New York State Data Center document, you'll notice that the
MOE has increased in absolute terms when summing over areas, but dropped in
relative terms in comparison to the sum. So MOE has increased but the
Coefficient of Variation has dropped. In other words, our aggregated estimate
is more precise than each of the smaller area estimates. Appendix 3 contains
all the calculations required.
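A short Python sketch of the absolute-versus-relative point. The three areas and their numbers are hypothetical, the MOEs are assumed to be at the 90 percent level (z = 1.645), and the aggregate MOE uses the usual root-sum-of-squares approximation.

```python
import math

Z90 = 1.645  # assumed: MOEs published at the 90 percent confidence level


def aggregate(estimates, moes):
    """Sum the estimates and approximate the MOE of the sum."""
    total = sum(estimates)
    moe = math.sqrt(sum(m * m for m in moes))
    return total, moe


def cv_percent(estimate, moe):
    """Coefficient of variation: standard error as a percent of the estimate."""
    return 100.0 * (moe / Z90) / estimate


# Hypothetical estimates and MOEs for three small areas.
estimates = [1000, 1200, 800]
moes = [200, 220, 180]

total, total_moe = aggregate(estimates, moes)

if __name__ == "__main__":
    for e, m in zip(estimates, moes):
        print(f"area: est={e} moe={m} cv={cv_percent(e, m):.1f}%")
    print(f"sum:  est={total} moe={total_moe:.1f} cv={cv_percent(total, total_moe):.1f}%")
```

In this made-up case the MOE of the sum is larger than any single area's MOE, while its CV is smaller than any single area's CV.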
Ed, do you recall where you saw that this type of calculation should be
limited to 3 summands at a time?
Steven Farber, Ph.D
Assistant Professor
Department of Geography
University of Utah
http://stevenfarber.wordpress.com
-----Original Message-----
From: ctpp-news-bounces(a)chrispy.net
[mailto:ctpp-news-bounces@chrispy.net] On Behalf Of liang.long(a)dot.gov
Sent: March-12-13 10:18 AM
To: ctpp-news(a)chrispy.net
Subject: Re: [CTPP] Working with County flow data
I can see why Census doesn't recommend doing more than three variables at a
time. When you add 17 counties together, you get a much bigger area with
more households sampled. In theory, you should get smaller MOEs compared to
each individual county. But if you derive MOEs from those 17 counties, you
will get much bigger MOEs, which is contradictory to the theory.
________________________________________
From: ctpp-news-bounces(a)chrispy.net [ctpp-news-bounces(a)chrispy.net] on
behalf of Ed Christopher [edc(a)berwyned.com]
Sent: Tuesday, March 12, 2013 11:15 AM
To: ctpp-news(a)chrispy.net
Subject: Re: [CTPP] Working with County flow data
Thanks--I know the spreadsheet allows you to recalculate MOEs for more
than three variables but I remember doing more than 3 a while back and I
was getting some wild MOEs. When I dug into it I found something in the
Census compass reports that said not to do more than three variables at a
time. I was hoping that someone figured out a way around this.
Ed C
On Mar 12, 2013, at 9:59 AM, "Hoctor Mulmat, Darlanne"
<Darlanne.Mulmat@sandag.org> wrote:
The New York State Data Center developed a Statistical Calculations Menu
that includes an option for computing the margin of error for the sum of
three or more estimates. See attached.
Darlanne Hoctor Mulmat
Applied Research Division - Criminal Justice/Public Policy
San Diego Association of Governments
619-699-7326
From: ctpp-news-bounces@chrispy.net [mailto:ctpp-news-bounces@chrispy.net]
On Behalf Of Ed.Christopher@dot.gov
Sent: Tuesday, March 12, 2013 6:57 AM
To: ctpp-news@chrispy.net
Subject: [CTPP] Working with County flow data
Has anyone come up with some easy ways for collapsing and grouping counties
together using last week's county flow data and recalculating new MOEs? I
have so many counties that I want to group together that I am
looking for a quick way that can handle "lots" of counties. Another issue
I am struggling with is that we are always told not to group more than
three variables at a time or the formulas for calculating the new MOE do
not really work. This is particularly troublesome especially if I am
trying to group 17 counties together. What it comes down to is 9 different
calculations given that I can only group 3 counties at a time together.
Anyone figure out any short cuts or ways around this short of disregarding
the MOEs altogether? Given all the clustering that I am looking at using
the "cheat" sheets I am used to, I will be recalculating MOEs for weeks.
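One possible way to handle "lots" of counties in a single pass, sketched in Python. The record layout (origin county, destination county, estimate, MOE), the county-to-group lookup, and the function name are all assumptions for illustration, not the layout of the actual county flow file. The MOE uses the usual root-sum-of-squares approximation, so the caution about ignored covariances applies to the result.

```python
import math
from collections import defaultdict


def regroup_flows(flows, county_to_group):
    """Collapse county-to-county flows into group-to-group flows.

    flows:           iterable of (origin, dest, estimate, moe) tuples.
    county_to_group: dict mapping each county to its group name.
    Returns {(origin_group, dest_group): (estimate, approx_moe)}.
    """
    est = defaultdict(float)
    sum_sq = defaultdict(float)
    for origin, dest, estimate, moe in flows:
        key = (county_to_group[origin], county_to_group[dest])
        est[key] += estimate
        sum_sq[key] += moe ** 2  # accumulate squared MOEs per group pair
    return {key: (est[key], math.sqrt(sum_sq[key])) for key in est}


if __name__ == "__main__":
    # Hypothetical counties, groups, and flows.
    groups = {"A": "North", "B": "North", "C": "South"}
    flows = [("A", "C", 100, 30), ("B", "C", 200, 40), ("C", "C", 500, 60)]
    for pair, (e, m) in regroup_flows(flows, groups).items():
        print(pair, e, round(m, 1))
```

Note that with this formula, combining three counties at a time and then combining the subtotals gives the same number as combining all 17 at once, since the squared MOEs simply add up. Chaining the calculation in threes therefore does not change the arithmetic; the accuracy concern is about the approximation itself.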
Ed Christopher
<StatisticalCalculationsMenu.xls>
_______________________________________________
ctpp-news mailing list
ctpp-news@ryoko.chrispy.net
http://ryoko.chrispy.net/mailman/listinfo/ctpp-news
--
Ed Christopher
708-283-3534 (V)
708-574-8131 (cell)
FHWA RC-TST-PLN
4749 Lincoln Mall Drive, Suite 600
Matteson, IL 60443
--
Patricia C. (Patty) Becker
APB Associates/Southeast Michigan Census Council (SEMCC)
28300 Franklin Rd, Southfield, MI 48034
office: 248-354-6520
home:248-355-2428
pbecker(a)umich.edu