Tim,
I am currently working with the InfoGroup dataset for San Luis Obispo County (California Central Coast; population 270,000; 7 cities and the county; approximately 15,000 business records), as part of a regional land use and travel model improvement plan. As we are doing land use modeling at the parcel level, we are in the process of linking employment point data to the parcel data. At the same time, our local/regional branch of CalFire is developing an address point shapefile/database for the whole county.
The InfoGroup dataset includes LAT-LONG fields, so you can display the records using coordinate data; however, most employment points will not align with the correct parcel. They will likely still align with the correct grid cell (whether 250m or 1km), depending on the level of detail at which you’re working.
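To illustrate the grid cell point, here is a minimal sketch of assigning a point to a 250m or 1km cell. It assumes the coordinates have already been projected from LAT-LONG into a planar system measured in meters (UTM, for example); the function name and the sample coordinates are hypothetical, not anything from the InfoGroup file.

```python
import math

def grid_cell_id(x_m, y_m, cell_size_m=250.0):
    """Return the (column, row) index of the grid cell containing a point.

    Assumes x_m and y_m are projected coordinates in meters,
    not raw latitude/longitude.
    """
    col = math.floor(x_m / cell_size_m)
    row = math.floor(y_m / cell_size_m)
    return col, row

# Hypothetical projected coordinates for two employer points that sit
# about 75 m apart: probably different parcels, but the same grid cell.
points = {"A": (712340.0, 3907815.0), "B": (712410.0, 3907790.0)}
for name, (x, y) in points.items():
    print(name, grid_cell_id(x, y, 250.0), grid_cell_id(x, y, 1000.0))
```

A positional error of a few tens of meters matters at the parcel level but usually leaves the cell index unchanged, except for points that fall near a cell boundary.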
As we’re working at a very fine level of detail here (parcel level), I’ll spare you the details on fixing the alignment of business/employer points, except to say that we are linking the unique IDs of the address point dataset with the business/employer records.
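For a rough idea of what that linking can look like, here is a simplified sketch that matches business records to address points on a normalized address string. The field names ("address", "point_id") and the sample records are assumptions for illustration only; our actual process involves more cleanup than an exact string match.

```python
import re

def normalize_address(address):
    """Uppercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", address.upper())
    return " ".join(cleaned.split())

def link_to_address_points(businesses, address_points):
    """Attach an address point ID to each business record.

    A record gets point_id=None when no normalized address matches,
    so unmatched records can be reviewed separately.
    """
    lookup = {normalize_address(p["address"]): p["point_id"]
              for p in address_points}
    linked = []
    for b in businesses:
        key = normalize_address(b["address"])
        linked.append({**b, "point_id": lookup.get(key)})
    return linked

# Hypothetical sample data.
address_points = [{"point_id": 101, "address": "100 MAIN ST"}]
businesses = [
    {"name": "Example Cafe", "address": "100 Main St."},
    {"name": "Example Garage", "address": "PO Box 12"},
]
for record in link_to_address_points(businesses, address_points):
    print(record["name"], record["point_id"])
```

Once a business record carries the address point ID, it inherits that point's location, which is what places it on the correct parcel.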
As far as the quality of the data, there are duplicate records to contend with. Some typical issues:
· A business may have changed names over the years, but still stayed in business: you’ll likely have two records for that single business;
· Multiple address/business records for the one major university in the region, which indicate various university departments, special facilities, etc. (with the number of universities in the Greater Boston region, this could be a concern no matter what dataset you go with)
· Multiple address/business records for places like medical offices, where a single medical office or practice may have a half-dozen (or many more) physicians, but it’s really just one employment location;
· A similar problem for hospitals: many business addresses that represent various departments of the medical facility, but it’s really just one employment location;
· PO Boxes: Although no physical address may be associated with these data records, you can still rely on coordinate data to display these data points;
· Non-standard addresses: difficult to geocode, but typically not a large percentage of the overall dataset;
· Work at home: There is a field that indicates “at-home” businesses, so you may want to sort those out and handle them separately.
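As a sketch of how some of these issues might be screened, the snippet below sets aside at-home businesses and groups the remaining records by address, so that locations with several records (renamed businesses, medical offices, university or hospital departments) can be reviewed by hand. The field names ("name", "address", "at_home") and the sample records are hypothetical; the actual InfoGroup field names will differ.

```python
from collections import defaultdict

def collapse_by_location(records):
    """Split out at-home businesses and group the rest by address.

    Returns (locations, at_home). Each location lists the distinct
    business names and the record count found at that address. This
    only flags candidates for review; it does not decide which
    records are true duplicates.
    """
    at_home = [r for r in records if r.get("at_home")]
    grouped = defaultdict(list)
    for r in records:
        if not r.get("at_home"):
            grouped[r["address"].strip().upper()].append(r)
    locations = []
    for address, group in grouped.items():
        locations.append({
            "address": address,
            "names": sorted({r["name"] for r in group}),
            "record_count": len(group),
        })
    return locations, at_home

# Hypothetical sample records.
records = [
    {"name": "Dr. A", "address": "200 Clinic Way", "at_home": False},
    {"name": "Dr. B", "address": "200 clinic way", "at_home": False},
    {"name": "Old Name Inc", "address": "5 Oak St", "at_home": False},
    {"name": "Home Consulting", "address": "9 Elm Ct", "at_home": True},
]
locations, at_home = collapse_by_location(records)
for loc in locations:
    print(loc["address"], loc["record_count"], loc["names"])
print("at-home records:", len(at_home))
```

Employment counts are deliberately not summed here: whether the records at one address should be added together or treated as duplicates of a single total is a judgment call to make per location.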
You can geocode the dataset (or significant portions of it) where you have good, standard addresses. Overall, the addresses associated with the data seem to be very good; unit or suite numbers are included as necessary. The fields for “SECONDARY_ADDRESS” seem to all be local physical addresses, whereas the fields for “PRIMARY_ADDRESS” seem to include some out-of-area addresses. Branch vs. HQ issues do not seem prevalent. Public employment data is rather shaky; it is probably best to rely on the state-level “employment development department” or equivalent.
I mentioned the address point file that is in progress in our region, as it may be something to consider in your region: Is there a master address file for some or all of the region? If it does exist, more than likely the spatial alignment would be very good, and it may pay off to link the new dataset with a regional address point dataset.
Feel free to follow up with any questions.
Thanks!
Geoffrey Chiapella
Transportation Planner
San Luis Obispo Council of Governments
1114 Marsh Street
San Luis Obispo, CA 93401
(805) 781-5190
gchiapella@slocog.org
From: ctpp-news-bounces@chrispy.net [mailto:ctpp-news-bounces@chrispy.net] On Behalf Of Reardon, Tim
Sent: Tuesday, June 28, 2011 12:15 PM
To: 'ctpp-news@chrispy.net'
Subject: [CTPP] Proprietary Employer Data -- Comments on the various providers?
I have managed to assemble some funding to acquire employer data for our 164-municipality transportation modeling region in Eastern MA, and I am wondering if any of you have comments on the accuracy and utility of the various proprietary employer data sources currently available. I have been in conversations with two major providers (InfoGroup and Dun & Bradstreet) and have received sample files for certain zip codes, but it is hard to assess the accuracy or completeness of either sample.
I’m wondering if anybody can offer insight on working with such data, and whether you have suggestions on choosing a vendor. We will be using it primarily to determine employment by sector at very fine geographies (250m or 1km grid cells) for land use planning and analysis. Some concerns we have already identified include: branch vs. headquarters employment, public sector employment, “paper companies” and verification, and the accuracy of the geocoded location that accompanies each record.
Any thoughts are appreciated. Feel free to reply off-list if concerned about publicly trumpeting or bashing somebody’s product.
Thanks,
Tim Reardon
___________________________________________
Timothy G. Reardon -- Senior Regional Planner
Metropolitan Area Planning Council
60 Temple Place | Boston, MA 02111
617-451-2770 x2011
Say it with a map! Find data and bring it alive at www.MetroBostonDataCommon.org!
Please be advised that the Massachusetts Secretary of State considers e-mail to be a public record, and therefore subject to the Massachusetts Public Records Law, M.G.L. c. 66 § 10.