Breakthroughs in Historical Record Linking Using Genealogy Data: The Census Tree Project

Kasey Buckles, Adrian Haws, Joseph Price and Haley E.B. Wilbert

No 31671, NBER Working Papers from National Bureau of Economic Research, Inc

Abstract: The Census Tree is the largest-ever database of record links among the historical U.S. censuses, with over 700 million links for people living in the United States between 1850 and 1940. These high-quality links allow researchers in the social sciences and other disciplines to construct a longitudinal dataset that is highly representative of the population. In this paper, we describe our process for creating the Census Tree, beginning with a collection of over 317 million links contributed by the users of a free online genealogy platform. We then use these links as training data for a machine learning algorithm to make new matches, and incorporate other recent efforts to link the historical U.S. censuses. Finally, we introduce a procedure for filtering the links and adjudicating disagreements. Our complete Census Tree achieves match rates between adjacent censuses that are between 69 and 86% for men, and between 58 and 79% for women. The Census Tree includes women and Black Americans at unprecedented rates, containing 314 million links for the former and more than 41 million for the latter.

JEL-codes: C81 J10 N01
Date: 2023-09
New Economics Papers: this item is included in nep-big, nep-evo, nep-his and nep-lab

Access to the full text is generally limited to series subscribers. However, free access is provided if the top-level domain of the client browser is in a developing country or transition economy. Free access is also available to older working papers.

Ordering information: Paper copy available by mail.

Publisher: National Bureau of Economic Research, Inc, 1050 Massachusetts Avenue, Cambridge, MA 02138, U.S.A.

Page updated 2024-03-31
Handle: RePEc:nbr:nberwo:31671