Building a Sample Frame of SMEs Using Patent, Search Engine, and Website Data

Arora Sanjay K., Kelley Sarah and Madhavan Sarvothaman
Additional contact information
Arora Sanjay K.: Ernst & Young, LLP, 1101 New York Ave NW, Washington, D.C., 20005, U.S.A.
Kelley Sarah: Child Trends, 7315 Wisconsin Avenue, Suite 1200W, Bethesda, MD, 20814, U.S.A.
Madhavan Sarvothaman: American Institutes for Research, Washington, D.C., 20007, U.S.A.

Journal of Official Statistics, 2021, vol. 37, issue 1, 1-30

Abstract: This research outlines the process of building a sample frame of US small and medium-sized enterprises (SMEs). The method starts with a list of patenting organizations and defines the boundaries of the population and subsequent frame using free or low-cost data sources, including search engines and websites. Generating high-quality data is of key importance throughout the process of building the frame and subsequent data collection; at the same time, there is too much data to curate by hand. Consequently, we turn to machine learning and other computational methods to apply a number of data matching, filtering, and cleaning routines. The results show that it is possible to generate a sample frame of innovative SMEs with reasonable accuracy for use in subsequent research: our method provides data for 79% of the frame. We discuss implications for future work for researchers and national statistical institutes (NSIs) alike and contend that the challenges associated with big data collections require not only new skillsets but also a new mode of collaboration.

Keywords: Sample frame; administrative and big data; machine learning; bias; small and medium-sized enterprises
Date: 2021

Downloads: (external link)
https://doi.org/10.2478/jos-2021-0001 (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:vrs:offsta:v:37:y:2021:i:1:p:1-30:n:6

DOI: 10.2478/jos-2021-0001

Journal of Official Statistics is currently edited by Annica Isaksson and Ingegerd Jansson

More articles in Journal of Official Statistics from Sciendo
Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-20
Handle: RePEc:vrs:offsta:v:37:y:2021:i:1:p:1-30:n:6