Extracting Microdata
Tips for Extracting Microdata

Technical notes and tips for extracting microdata are available on the websites of major data sources. Listed below are some of the notable ones.
Efficiency Tips for Extracting PUMS Data

When extracting data from the US PUMS, minor changes in programming strategy can lead to major gains in performance. These tips, provided by UCLA Academic Technology Services, are especially relevant to accessing census data.
 

Introduction to Data Handling

This is a handout provided by the University of Chicago's data archivists. It introduces the basics required to extract data and convert "raw" data into a dataset for use by a statistical application, specifically SPSS or SAS. It covers topics such as reading a codebook, identifying data structures, and developing programs for reading raw data into a statistical application.
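As an illustration of what "reading a codebook" means in practice, here is a minimal sketch in Python (not SPSS or SAS, and not taken from the handout). It assumes a fixed-width raw file in which the codebook gives each variable's start and end column. The variable names, column positions, and sample records are all hypothetical.

```python
import io

# Hypothetical codebook: variable name -> (start column, end column),
# 1-based and inclusive, as column positions are usually printed in codebooks.
CODEBOOK = {
    "serial": (1, 5),
    "age": (6, 7),
    "sex": (8, 8),
    "income": (9, 14),
}


def parse_record(line, codebook):
    """Slice one fixed-width record into a dict of stripped string fields."""
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in codebook.items()
    }


def read_raw(handle, codebook):
    """Parse every non-blank line of a raw data file."""
    return [
        parse_record(line.rstrip("\n"), codebook)
        for line in handle
        if line.strip()
    ]


# Two made-up records standing in for a raw data file on disk.
RAW = "00001342 52000\n00002271 38500\n"
records = read_raw(io.StringIO(RAW), CODEBOOK)
```

Values are kept as strings here. Converting them to numbers, and mapping codes such as sex = 2 to labels, would follow the value definitions in the actual codebook.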

How to Produce a Data Extract with the Data Extraction System (DES)


To produce a data extract using the U.S. Census Bureau's Data Extraction System version 4.2 by means of a HyperText Markup Language (HTML) tool, you must have: 1) a 'browser tool' (e.g. Mosaic, Netscape, Lynx) that supports 'HTML-forms' and can access cache memory, and 2) a computer with enough cache memory to support very large HTML-form documents. Producing an extract with the Data Extraction System requires completing a six-'Level' process of defining your extraction.

FAQ with Data FERRET System

Tips for extracting data via the U.S. Census Bureau's FERRET system. Answers are provided for questions such as: 1) How do you convert a SAS transport dataset back to a regular SAS dataset? and 2) I am trying to extract eight items from the CPS Microdata. But every time I hit the enter button on the second item, the first item gets unhighlighted. Is there something wrong with my machine, or can I only get one item at a time? (I am using Netscape on my PC).

Accessing Online Census Data Files at ATS, UCLA


UCLA Academic Technology Services (ATS) provides online access to selected US Census data files as SAS data files. Even though these files are stored as SAS data files, you do not need to be a SAS expert to access and use them. This documentation illustrates a number of different strategies which can be used to access this census data.

Extracting Data in the Harvard-MIT Data Center

This tutorial is intended to help you take a dataset in some known but inconvenient format and convert it to a more usable format, extracting only the variables and cases that interest you. For most data extraction, the program DBMS/Copy is recommended, although a few tasks require SAS or SPSS.
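The idea of keeping only selected variables and cases can be sketched in a few lines of Python. This is a stand-in for illustration, not the DBMS/Copy, SAS, or SPSS procedure the tutorial describes. The column names, the sample rows, and the selection rule (adults in Illinois) are hypothetical.

```python
import csv
import io


def extract(rows, keep_vars, keep_case):
    """Yield only the chosen variables for rows that satisfy keep_case."""
    for row in rows:
        if keep_case(row):
            yield {name: row[name] for name in keep_vars}


# Made-up source data standing in for a larger delimited file.
SOURCE = "id,age,state,income\n1,34,IL,52000\n2,17,IL,0\n3,45,CA,61000\n"

reader = csv.DictReader(io.StringIO(SOURCE))
subset = list(
    extract(
        reader,
        keep_vars=["id", "income"],
        keep_case=lambda r: int(r["age"]) >= 18 and r["state"] == "IL",
    )
)

# Write the extract in a more convenient format (here, a smaller CSV).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "income"], lineterminator="\n")
writer.writeheader()
writer.writerows(subset)
```

Because `extract` is a generator, rows are filtered one at a time as they are read, so the full source file never has to be held in memory.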

Tips with the Enhanced Extract System of IPUMS, University of Minnesota

On Friday, August 18, 2000 the old IPUMS extract system was replaced by a new system incorporating enhanced features requested by users. One of the key features of the new system is the ability to modify and resubmit previous jobs. Here are the most common technical aspects of the extract conversion that might affect data users, along with strategies for dealing with them.

Using SETS to Extract NHIS and Other NCHS Data on CD-ROM

This is a tip from the Electronic Data Services at Columbia University. It illustrates how NHIS data on CD-ROMs can be accessed by using NCHS's Statistical Export and Tabulation System (SETS) software. This technical note on the SETS User Interface will guide you through the process of extracting a subset of the NHIS data from the CD-ROM for your analysis. 

Caveats and an Example of Data Extraction

Tips for accessing data from the Bermuda Biological Station for Research (BBSR). As part of the U.S. Joint Global Ocean Flux Study (JGOFS), researchers from the Bermuda Biological Station are conducting long time-series studies of biogeochemical cycles in the Sargasso Sea near Bermuda. The tutorial provides assistance with extracting the time-series biological data sets produced by this research.

CPS Utilities, Unicon Research Corporation

Despite their importance to the research community, the Current Population Survey (CPS) files distributed by the Census Bureau are inconvenient to use in several ways, particularly for the novice but even for those experienced in the use of these data. The CPS Utilities, consisting of CDs containing data, documentation and Windows software, can help researchers easily find and extract data from the U.S. Census Bureau's Current Population Survey.  Extractions are formatted as Stata datasets, or as raw ASCII files with SAS and SPSS input code.   Each extraction is documented in a report file.   Options allow selection of observations, renaming of variables, smart recoding, and more.  Powerful search utilities allow variables of interest to be identified quickly.  Documentation for each variable is consolidated on a single page covering all survey years, and is hyperlinked to documentation for related variables and appendices.