This e-mail is a follow-up to another posted a while ago. Daryl Scott of the South Western Regional Planning Agency wrote me a note on how he converted SF1 data into an Oracle database and subsequently linked it to ArcView. Attached is his note.
SF1 has a number of tables. Oracle and other databases are especially capable of handling huge tables and files. Daryl's note will be useful to anyone interested in developing extensive Census 2000 databases for their planning region.
Thanks
Nanda Srinivasan
**********************************************
INCOMING from Daryl Scott:
Here is my experience with importing the data into Oracle.
My goal was to load all the data for the State of Connecticut, all Counties
in Connecticut, all Towns in Connecticut, and all tracts, block groups, and
blocks in Fairfield County from Census 2000 Summary File 1 into Oracle and
link it to GIS. Below are the steps I took to accomplish that task. I do
not describe the failed attempts in detail.
1. I downloaded the data in ASCII format for Connecticut from the Census Web
Site and extracted all the files.
2. I imported the ctgeo file into Oracle.
3. Then I created a view that contained all the records that I wanted to
import into Oracle.
---
create view ctfcgeo as
SELECT sumlev, geocomp, logrecno, state, county, cousub, cousubcc, place,
placecc, placedc, tract, blkgrp, block, msacmsa, cmsa, macci,
pmsa, necma, necmacci, ua, uatype, ur, sldu, sldl, vtd, vtdi, zcta3, zcta5,
arealand, areawatr, arealand/2590000 as area_sqmi, name,
funcstat, gcuni, pop100, hu100, sdelm, sdsec, sduni, taz, macc, uacp, stfid
FROM ctgeo
WHERE (sumlev = '040' and geocomp = '00') or (sumlev = '050') or
(sumlev = '060' and name like '% town') or (sumlev = '140' and county = '001') or
(sumlev = '150' and county = '001') or (sumlev = '101' and county = '001');
---
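The selection logic of this view can also be sketched outside the database, for example to produce the list of logrecno values used in the next step. The following is a minimal Python sketch, not the method used above; it assumes each geographic header record is available as a dictionary keyed by the lowercase field names from the view, and the function name is hypothetical.

```python
def wanted(rec):
    """Return True if a geo record satisfies the WHERE clause of the ctfcgeo view.

    rec is assumed to be a dict of strings keyed by field name (an assumption
    of this sketch; the original work did the selection in Oracle).
    """
    sumlev = rec["sumlev"]
    if sumlev == "040":
        # State level: keep only the total (geocomp '00') record
        return rec["geocomp"] == "00"
    if sumlev == "050":
        # All counties in the state
        return True
    if sumlev == "060":
        # Same test as SQL: name LIKE '% town'
        return rec["name"].endswith(" town")
    if sumlev in ("140", "150", "101"):
        # Tracts, block groups, and blocks in county 001 only
        return rec["county"] == "001"
    return False
```

Applied to a list of records, `[r["logrecno"] for r in records if wanted(r)]` would give the logrecno values to keep.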
4. I tried to import one data file and delete the unneeded records with a
SQL command, but the process proved to be very inefficient. So I wrote an
Avenue script that extracts the desired records from the sf1 data files. I
saved the list of good values from the logrecno field to a text file. The
script created one .dat file for each .uf1 file. In addition, the
script removed the first four fields from the uf1 file. The script caused
ArcView and even the computer to hang, but that was how I knew the script
was working.
---
' Folder holding the extracted SF1 files and the list of wanted logrecno values
mypath="d:\gis\gisdata_temp\Census2000\sumfile1\"
file_list={"ct00005","ct00006","ct00007","ct00008","ct00009",
"ct00010","ct00011","ct00012","ct00013","ct00014","ct00015","ct00016","ct00017","ct00018","ct00019",
"ct00020","ct00021","ct00022","ct00023","ct00024","ct00025","ct00026","ct00027","ct00028","ct00029",
"ct00030","ct00031","ct00032","ct00033","ct00034","ct00035","ct00036","ct00037","ct00038","ct00039"}
' Read the wanted logrecno values (one per line) into a list of numbers
rec_file=TextFile.Make((mypath+"ctfcgeo_logrecno.txt").AsFileName,#FILE_PERM_READ)
rec_source=rec_file.Read(rec_file.GetSize)
rec_file.Close
rec_list_string = {}
rec_list_number = {}
rec_list_string=rec_source.AsTokens(nl)
for each rec in rec_list_string
  rec_list_number.add(rec.AsNumber)
end
' Write one trimmed .dat file for each .uf1 file
for each file_prefix in file_list
  data_file_name=(mypath+file_prefix+".uf1").AsFileName
  newdata_file_name=(mypath+file_prefix+".dat").AsFileName
  data_file=LineFile.Make(data_file_name,#FILE_PERM_READ)
  newdata_file=LineFile.Make(newdata_file_name,#FILE_PERM_WRITE)
  for each i in 0..(data_file.GetSize-1)
    dline=data_file.ReadElt
    if (dline=nil) then continue end
    ' Drop the first four fields (15 characters); logrecno is then the first 7 characters
    dline=dline.Right(dline.Count-15)
    d_recno=dline.Left(7)
    d_num=d_recno.AsNumber
    ' Skip records whose logrecno is not in the wanted list
    results=rec_list_number.FindByValue(d_num)
    if (results= -1) then continue end
    newdata_file.WriteElt(dline)
  end
  newdata_file.Close
  data_file.Close
end
msgbox.info("Finished Processing Files","Script")
---
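For readers without ArcView, the same extraction can be sketched in Python. This is an illustrative alternative, not the script used above. It assumes the .uf1 files are comma-delimited with logrecno as the fifth field, and it swaps the list search for a set lookup, which avoids scanning the whole list for every record. The function names are hypothetical.

```python
def filter_lines(lines, keep):
    """Yield records whose logrecno is in keep, with the first four fields removed.

    lines: iterable of comma-delimited text lines (assumed .uf1 layout).
    keep:  set of logrecno values as integers.
    """
    for line in lines:
        # Split off the first four fields; the remainder starts with logrecno
        parts = line.rstrip("\r\n").split(",", 4)
        if len(parts) < 5:
            continue  # blank or malformed line
        rest = parts[4]
        logrecno = rest.split(",", 1)[0]
        if logrecno.isdigit() and int(logrecno) in keep:
            yield rest


def filter_file(src, dst, keep):
    """Write the kept, trimmed records of one .uf1 file to a .dat file."""
    with open(src) as fin, open(dst, "w") as fout:
        for rest in filter_lines(fin, keep):
            fout.write(rest + "\n")
```

Calling `filter_file` once per data file, with `keep` built from the saved logrecno text file, would reproduce the one-.dat-per-.uf1 output described in step 4.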
5. The following day, I created the SQL statements to create the tables in
Oracle and control files to import the information in the dat files. (These
files were compressed into a zip file and attached to this document.) I
decided to use the same field names as described in the data dictionary. I
sought help from Nanda Srinivasan at FHWA who told me about the sf1combo.xls
file found at http://mcdc2.missouri.edu/data/sf12000/Tools/. That file
saved hours of work because I was able to create most of my SQL and control
files from it. I still had to explore the data files to see which fields
had decimal places, but fortunately most of the data did not have decimal
places. I also added the number of the datafile to the logrecno field name
to make sure that each table had unique fields (e.g. logrecno1 for ct00001,
logrecno2 for ct00002, and so on). I also used TextPad
(http://www.textpad.com) to create the SQL and control files because of its
ability to select text as blocks rather than lines. The SQL and control
files worked and I was able to import the data from SF1 into Oracle.
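The SQL and control files described in step 5 could also be generated by a short program instead of by hand. The sketch below is a hedged illustration only: the table name, the column types, and the sample field are assumptions made for the example, not the contents of the attached zip file. It emits a CREATE TABLE statement and a basic SQL*Loader control file for one comma-delimited .dat file.

```python
def create_table_sql(table, logrecno_col, fields):
    """Build a CREATE TABLE statement.

    fields is a list of (name, decimals) pairs; decimals is 0 for whole numbers.
    The NUMBER precisions are assumptions of this sketch.
    """
    cols = [f"  {logrecno_col} NUMBER(7)"]
    for name, decimals in fields:
        ctype = f"NUMBER(12,{decimals})" if decimals else "NUMBER(12)"
        cols.append(f"  {name} {ctype}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(cols) + "\n);"


def control_file(table, datafile, logrecno_col, fields):
    """Build a minimal SQL*Loader control file for a comma-delimited data file."""
    names = [logrecno_col] + [name for name, _ in fields]
    return (
        "LOAD DATA\n"
        f"INFILE '{datafile}'\n"
        f"INTO TABLE {table}\n"
        "FIELDS TERMINATED BY ','\n"
        "(" + ", ".join(names) + ")\n"
    )


# Hypothetical table and field names, for illustration only
print(create_table_sql("sf1_00001", "logrecno1", [("P001001", 0)]))
print(control_file("sf1_00001", "ct00001.dat", "logrecno1", [("P001001", 0)]))
```

The field list would in practice come from the data dictionary or the sf1combo.xls file, with the decimal counts filled in after inspecting the data.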
6. Then I had to link the data to TIGER 2000 files in shape file format. I
already had the shape files from http://www.geographynetwork.com/. I
created some views in Oracle where I joined the data files to the geography
file and used the [logrecno] field as the common field. Then, I used
ArcView's Database Access Extension to add the Oracle views to an ArcView
project. Except for the town level, I discovered that I needed to create a
common field to link the Oracle views to the shape files. Thus, I created a
stfid field in the ctgeo table with the SQL statements below.
---
alter table ctgeo add
stfid VARCHAR2(16);
update ctgeo
set stfid = state||county||tract
where sumlev='140';
update ctgeo
set stfid = state||county||tract||blkgrp
where sumlev='150';
update ctgeo
set stfid = state||county||tract||block
where sumlev='101';
---
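The key built by these statements is a plain concatenation of code fields, which can be checked outside the database. The following Python sketch mirrors the three UPDATE statements; the function name and the sample tract, block group, and block codes are made up for illustration.

```python
def make_stfid(sumlev, state, county, tract, blkgrp="", block=""):
    """Build the stfid join key the same way the UPDATE statements do.

    Returns None for summary levels the statements leave untouched (NULL in Oracle).
    """
    if sumlev == "140":   # census tract
        return state + county + tract
    if sumlev == "150":   # block group
        return state + county + tract + blkgrp
    if sumlev == "101":   # block
        return state + county + tract + block
    return None


# Hypothetical codes: state 09, county 001, tract 010101, block group 1, block 1000
print(make_stfid("140", "09", "001", "010101"))                # 09001010101
print(make_stfid("150", "09", "001", "010101", blkgrp="1"))    # 090010101011
print(make_stfid("101", "09", "001", "010101", block="1000"))  # 090010101011000
```

The longest key, at the block level, is 15 characters, which fits in the VARCHAR2(16) column defined above.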
7. To improve performance, I created indexes on the logrecno* field in each
sf1 table. I also indexed the sumlev field and stfid field in the ctgeo
table.
It took a little bit longer than expected, but I accomplished my goals. I
imported the sf1 data into Oracle and became able to produce thematic maps
in ArcView. Because of this approach, the sf1 data can be used in ArcView,
Microsoft Access, and any other software that can connect to the Oracle
database through an ODBC connection.
<<sf1_oracle_sql.zip>>
--
Daryl Scott
South Western Regional Planning Agency
Stamford Government Center
888 Washington Blvd., 3rd Floor
Stamford, CT 06901
Tel: (203) 316-5190
Fax: (203) 316-4995
E-mail: dscott(a)swrpa.org
The CTPP Working Group is currently developing summary levels for reporting CTPP 2000 data. A summary level in CTPP 2000 is a geographic unit of reporting. For example, urban TAZ is a reporting summary level. The hierarchy for reporting TAZs will be State-County-TAZ (meaning that State-level and County-level totals will be present as separate records).
We want your input on two reporting issues. The attached document (listservsumlev.doc) explains these issues.
Please send your suggestions to Nanda Srinivasan (Email: ctpp(a)fhwa.dot.gov, Phone: 202-366-5021)
Call for Poster Session Papers/Presentations
The Subcommittee on Census Data for Transportation Planning (A1D08-1) is
interested in developing a poster session for the 81st TRB Annual
Meeting in January 2002. The subject of the poster session will center
on the innovative and creative ways in which census-related data is
being presented, displayed or delivered.
Under the TRB guidelines, a poster session is a series of presentations
on vertical display boards with direct interaction between the presenter
and attendees. The entire presentation is placed on a display board and
should be considered the equivalent to the conventional paper or
presentation sessions.
Typically, a TRB Poster Session is made up of reviewed papers. However,
due to the evolving nature of the subject and the fact that the US
Census related data is just now being released--time is
short--presentations will be considered.
Individuals interested in sharing some of the innovative and creative
ways in which they are displaying and making Census data available
within their transportation community are encouraged to "show their
work". Those seeking publication as part of the TRB Research Record
series need to submit their paper, according to TRB guidelines,
no later than August 1, 2001. For more information on the paper
submittal process or the Annual Meeting, refer to:
http://www4.trb.org/trb/annual.nsf
Those wishing to present their materials without seeking full
publication may submit an abstract by August 1, 2001 to Ed Christopher,
Chair of the Subcommittee on Census Data for Transportation Planning at
the address and phone number below.
More detailed information and general instructions for a TRB Poster
Session can be found at
http://www.nas.edu/trb/archives/publications/am/poster.pdf
Should you have any questions please contact either Ed Christopher,
Subcommittee Chair, or Chuck Purvis, Chair of the Urban Data and
Information Systems Committee (A1D08).
Ed Christopher
Transportation Industry Analyst
Bureau of Trans. Statistics K-30
400 Seventh Street, SW
Washington D.C. 20590
202-366-0412
edc(a)bts.gov
Chuck Purvis, AICP
Senior Transportation Planner Metropolitan Trans. Commission
101 Eighth Street
Oakland, CA 94607
510- 464-7731
cpurvis(a)mtc.ca.gov
The Census Bureau began releasing Summary File 1 (SF1) data on June 13, 2001, on a state-by-state basis.
SF1 contains the 100-percent data, which is the information compiled from the questions asked of all people and about every housing unit. Most of the SF1 data is reported at the Census Block level. Since the Transportation Analysis Zones defined in TIGER/Line 2000 are aggregations of Census Blocks, transportation planning agencies can assemble SF1 data for their TAZs.
Attached is a note on SF1 data and a process to convert SF1 data to your TAZs.
Note to SAS and SPSS users: Code to automate the data transfer is posted at:
http://www.sdcbidc.iupui.edu/Profiles/profiles.html
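Because TAZs are aggregations of Census Blocks, the assembly amounts to summing block-level values by TAZ code. The following Python sketch illustrates the idea only and is not the process in the attached note. It assumes each block record is a dictionary carrying a taz code and numeric count fields such as pop100 and hu100, and that a blank taz means the block is not assigned to any TAZ. The function name is hypothetical.

```python
from collections import defaultdict


def aggregate_to_taz(block_rows, value_fields):
    """Sum the given count fields over all blocks that share a TAZ code."""
    totals = defaultdict(lambda: dict.fromkeys(value_fields, 0))
    for row in block_rows:
        taz = row["taz"]
        if not taz.strip():
            continue  # assumption: a blank code means no TAZ assigned
        for field in value_fields:
            totals[taz][field] += row[field]
    return dict(totals)


# Made-up block records, for illustration only
blocks = [
    {"taz": "000101", "pop100": 10, "hu100": 4},
    {"taz": "000101", "pop100": 5, "hu100": 2},
    {"taz": "000102", "pop100": 8, "hu100": 3},
]
print(aggregate_to_taz(blocks, ["pop100", "hu100"]))
```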
Thank you
Nanda Srinivasan
Agencies have written to the CTPP Working Group asking:
Can TAZs defined in TIGER/Line 2000 be altered or changed?
Can an agency submit new TAZs in Urban areas based on TIGER/Line 2000 geography for CTPP 2000?
The CTPP Working Group met on June 14, 2001 to discuss these issues. We decided that agencies could define new TAZs or alter TAZs in TIGER/Line 2000 and obtain CTPP 2000 data reported for the new TAZs. However, agencies must use Census 2000 tabulation blocks to build new TAZs or alter existing TAZs. In effect, an equivalency file between census blocks and TAZs will be built and used, as was done in previous censuses.
However, due to processing issues and schedule considerations, agencies wishing to do this should note the following:
1. Cost: To change TAZs or define new TAZs, agencies must separately contract with the Census Bureau, Journey to Work Branch, and work out a process and cost for defining and obtaining CTPP 2000 data based on the new TAZs.
2. Schedule: CTPP data will be processed for those agencies after the Standard CTPP 2000 products are released for the whole country. Currently, the expected completion date for CTPP 2000 is June 2003.
3. Exclusion of the changes from TIGER/Line database: There are no plans or processes in place to include new TAZs in the TIGER/Line database. Thus, the changes will not be reflected in TIGER/Line 2000 and it is unlikely that new TAZs will be inserted into TIGER until the next round of TAZ updating, the date of which has not been scheduled.
For more information, please contact Phillip Salopek, Census Bureau, Population Division (Journey to Work Branch) at 301-457-2454 (e-mail: phillip.a.salopek(a)census.gov).
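An agency preparing its own block-to-TAZ equivalency may want to confirm that no block is assigned to more than one TAZ before submitting it. The sketch below shows one way to do that check in Python. The layout of the equivalency file is not specified here, so the input is assumed to be simple (block ID, TAZ) pairs, and the function name is hypothetical.

```python
def build_equivalency(pairs):
    """Return (mapping, conflicts) for an iterable of (block_id, taz) pairs.

    mapping keeps the first TAZ seen for each block; conflicts lists, in sorted
    order, the blocks that were later assigned to a different TAZ.
    """
    mapping = {}
    conflicts = set()
    for block_id, taz in pairs:
        if block_id in mapping and mapping[block_id] != taz:
            conflicts.add(block_id)
        else:
            mapping[block_id] = taz
    return mapping, sorted(conflicts)
```

An empty conflicts list means every block belongs to exactly one TAZ in the input.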
We have been working on defining the proposed urbanized area for our
region.
However, we are encountering some issues with the "hop" step -- namely,
is there a quicker way in ArcView to identify eligible blocks or other
unconnected densely populated areas (BG cores + blocks) not contiguous
to the "core" (and their corresponding connecting paths), rather than
manually determining the shortest qualifying path between the block and
the core for each individual instance?
Thanks.
Haila R. Maze, Senior Planner
Berkeley Charleston Dorchester Council of Governments
5290 Rivers Avenue, Suite 400
North Charleston, SC 29406
(843) 529-0400
(843) 529-0305 fax
-----Original Message-----
From: Elaine Murakami [mailto:Elaine.Murakami@igate.fhwa.dot.gov]
Sent: Monday, April 09, 2001 12:31 PM
To: CTpp-news(a)chrispy.net
Subject: [CTPP] INFORMATION: Urbanized Area Proposed Definition
This information (probably in revised form) will be sent to the FHWA
field offices within the next day. However, I am sending this along to
the CTPP listserv to get it out as soon as possible.
----------------------------------------
Attached is a document to assist those areas which may want to evaluate
what the impact of the proposed definition of Urbanized Areas may be for
their area. You will need to know how to use GIS to conduct this
evaluation. While we conferred with the Census Bureau staff to assure
that these notes are correct, we cannot guarantee them. You will want
to print this in COLOR to be able to read the maps. Technical questions
should be addressed to ua(a)geo.census.gov. You may also call the Census
Bureau Geography Division at 301-457-1099.
Comments to the Federal Register notice (posted at
www.census.gov/geo/www/ua/ua_2k.html) are due by April 27, 2001.
Comments should be submitted to Director, U.S. Census Bureau, Room 2049,
Federal Building 3, Washington, D.C. 20233-0001.
The FHWA Office of Metropolitan Planning prepared these notes to assist
local areas; however, we will not be checking individual areas against
these proposed criteria, and will not be able to provide technical
assistance in operating ArcView or ArcInfo.
The Census Bureau makes a distinction between special places and group
quarters. It's kind of arcane, and in cases where it's just one facility
where the administration takes place at the same address that houses the
residents, it doesn't really matter. But in the case of multiple college
dorms it can matter a lot.
In the case of a college, the special place is the contact point for the
enumeration, most likely the college housing office.
The individual dorms are each a separate group quarters.
It sounds to me like they geocoded all the GQ facilities to the address of
the special place. I'd be willing to bet that the block where all the
Cornell students are enumerated is where the housing office is located.
This is sloppy, sloppy work and yes, they should fix it before SF3!
Patty Becker
-------------------------------------------------------------------------------------------------
Patricia C. (Patty) Becker 248/354-6520
APB Associates/SEMCC FAX 248/354-6645
28300 Franklin Road Home 248/355-2428
Southfield, MI 48034 pbecker(a)umich.edu
It looks like SF (Summary File) 1 data is starting to be released on a
flow basis. Delaware is now out and Vermont appears to be right
behind. Below is a link to a weekly release schedule that is posted on
a special page on the Census Bureau's Web site.
http://www.census.gov/Press-Release/www/2001/sumfile1.html
-----
Ed Christopher
Bureau of Transportation Statistics
U.S. Department of Transportation
400 Seventh Street SW
Washington DC 20590
202-366-0412
We are looking for a way to geocode addresses in our 7-county region using ArcView and TIGER. Our problem is street addresses with similar address ranges that occur in several civil divisions. We have built a file containing all the TIGER data for each of our 7 counties. We are looking for information on how to incorporate the civil division name in the address coding process. Our experience to date has been that ArcView selects the first address range it encounters and assigns that geocode.
Does anyone have any suggestions?
John Zastrow
Principal Specialist
Southeastern Wisconsin Regional Planning Commission
916 N. East Avenue
Waukesha, WI 53186
262.522.9099
Hi folks,
Would someone be able to send me a link to the spot where one can download
ZCTA geography? I had it once but can't seem to find it or get back there.
Jesse Jacobs
Transportation Planner/GIS Coordinator
AVCOG, 125 Manley RD, Auburn, ME 04210
Phone (207) 783-9186
Fax (207) 783-5211
e-mail jjacobs(a)avcog.org