Thanks, Elaine. I haven't actually used OnTheMap beyond exploring the web application and trying some downloads, but I thought I would forward the caveats that I have run across.
Re: Version 3 OTM...(see: http://www.vrdc.cornell.edu/onthemap/data/v3/notes-otm-v3.0.pdf )
V3 Caveats:
•QWI numbers are drawn from QWI release R2008Q2. Stable jobs and statistics related to 'stable jobs' have a bug which has been fixed in QWI release R2008Q4. An update to the data is expected later this year. Consult the LEHD site for more details.
•Current QWI numbers are considered experimental. Since they are computed using the same confidentiality protection technology as the general QWI numbers, but using much finer geographic cells, there are more suppressions (more smaller numbers to protect). Users should be aware that when aggregating numbers to levels that are comparable to the general QWI data, the numbers they generate will be systematically lower.
•Only one implicate has been released at this point. Future updates should be counted (and used) as additional implicates. Additional implicates may be released in the future as well.
•WAC, OD, and QWI files are only available for states participating in the LEHD program. RAC files are available for all states, even those not actively participating in the LEHD program, but coverage is limited. For example, a worker of a NJ company (NJ participates in LEHD) may live in NY (which has not yet been integrated). Thus, a residence area for this and other workers is defined, and available here for download. However, the residence area information will NOT include information on workers of NY companies, since that information was not available at the time that OnTheMap v3.0 was created. (This applies for v3.0 to: CT, DC, MA,
It appears that only one implicate has been released in Version 3.
The Version 2 documentation (section 1.2.3 of http://www.vrdc.cornell.edu/onthemap/doc/otm_public_master.pdf ) indicates that three implicates were available and adds the following warning:
This version of the data provides 3 implicates (independent draws in the synthesizing algorithm) for the OD matrix and the Residence Area Characteristics (RAC) files. This is reflected in the filenames, see Sections 1.4.3 and 1.4.4. For further information on how to properly analyze multiply synthesized or imputed datasets, see Raghunathan et al. (2003); Reiter (2004b) and Reiter (2004a), or consult Sessions 8a and 8b of the online INFO 747 class at Cornell University's CISER at http://vrdc.ciser.cornell.edu/info747/. A note of warning is in order, though: It is statistically incorrect to use the average of the 3 implicates unless the aggregator function is strictly linear. Adding geographic areas is linear, and forming ratios from two linearly aggregated quantities (earnings over employment, for example) can be done correctly as long as the numerator and the denominator are averaged separately.
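To make that last warning concrete, here is a minimal sketch in Python of combining three implicates. The area names and all numbers are made up for illustration and do not come from any OnTheMap file. It sums earnings and employment over areas within each implicate (the linear step), then compares the ratio formed from separately averaged numerator and denominator, as the note describes, with the average of the per-implicate ratios, which gives a different answer.

```python
from statistics import mean

# Hypothetical data: each implicate maps an area to (total earnings, employment).
# These values are invented for illustration only.
implicates = [
    {"area_1": (1200.0, 10.0), "area_2": (900.0, 10.0)},
    {"area_1": (1500.0, 15.0), "area_2": (1100.0, 10.0)},
    {"area_1": (900.0, 5.0), "area_2": (1000.0, 10.0)},
]


def aggregate(implicate):
    """Sum earnings and employment over all areas (a linear operation)."""
    earnings = sum(e for e, _ in implicate.values())
    employment = sum(n for _, n in implicate.values())
    return earnings, employment


# One (earnings, employment) total per implicate.
totals = [aggregate(imp) for imp in implicates]

# Approach described in the note: average numerator and denominator
# separately across implicates, then form the ratio.
mean_earnings = mean(e for e, _ in totals)
mean_employment = mean(n for _, n in totals)
ratio_of_means = mean_earnings / mean_employment

# Averaging the per-implicate ratios is a nonlinear use of the implicates
# and does not give the same number.
mean_of_ratios = mean(e / n for e, n in totals)

print(ratio_of_means)   # 110.0
print(mean_of_ratios)   # about 111.89
```

With these made-up numbers the two approaches differ by almost two units of earnings per job, which is the kind of discrepancy the warning is about.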