Abstract:
Researchers who use survey data must contend with the problem of nonignorable nonresponse among a survey's intended sample. Often, nothing is known about a nonresponding individual or household beyond the geographic location of the primary residence. However, this location can often be linked to Census sociodemographic characteristics of the approximate locality of each intended sample member. For example, in the absence of individual-specific data, some progress can be made on sample selection correction by using Census tract-level information. We report on the construction of a set of fifteen orthogonal factors that capture 88.7% of the variability across Census tracts in the 2000 Census. Because these factors are calculated from the universe of tracts, they may be interpreted as population measures rather than sampling estimates. Our fifteen factors are available to other researchers and can be accessed using a password available from the authors. In addition to their utility for survey sample selection corrections, this set of variables may also be useful as controls for neighborhood-level heterogeneity in ordinary modeling contexts.
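To illustrate what is meant by orthogonal factors that capture a stated share of variability, the following is a minimal sketch only. The abstract does not specify the extraction method, so the sketch assumes principal components computed on standardized variables, which may differ from the authors' procedure. The function name `orthogonal_components`, the synthetic data standing in for tract-level Census variables, and the use of 0.887 as a target share are all hypothetical choices made for illustration.

```python
import numpy as np


def orthogonal_components(X, target_share):
    """Return (scores, cumulative_share) for the fewest principal components
    whose cumulative share of total variance reaches target_share.

    Assumption: plain PCA on standardized columns, used here as a stand-in
    for whatever factor extraction the authors actually applied.
    """
    # Standardize each column so variables on different scales are comparable.
    # (Assumes no column is constant, otherwise the division would fail.)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    # SVD of the standardized data; rows of Vt are orthonormal loading vectors.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)

    # Share of total variance carried by each component, then the running total.
    share = s**2 / np.sum(s**2)
    cumulative = np.cumsum(share)

    # Smallest k whose cumulative share is at least target_share.
    k = min(int(np.searchsorted(cumulative, target_share)) + 1, len(s))

    # Component scores: one row per unit (e.g. tract), one column per component.
    scores = Z @ Vt[:k].T
    return scores, cumulative[:k]


# Synthetic stand-in for a tract-by-variable matrix (NOT real Census data):
# 500 hypothetical "tracts", 10 correlated variables driven by 3 latent traits.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.3 * rng.normal(size=(500, 10))

scores, cumulative = orthogonal_components(X, target_share=0.887)
print("components retained:", scores.shape[1])
print("cumulative share of variance:", float(cumulative[-1]))
```

Because the scores are computed from every row of the input rather than from a sample of rows, the resulting component values describe that full set of units directly, which parallels the point about population measures made above. The columns of `scores` are mutually uncorrelated by construction, so they can enter a regression as controls without collinearity among themselves.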