An Econometric Perspective on Algorithmic Subsampling

Sokbae (Simon) Lee and Serena Ng

Annual Review of Economics, 2020, vol. 12, issue 1, 45-80

Abstract: Data sets that are terabytes in size are increasingly common, but computer bottlenecks often frustrate a complete analysis of the data, and diminishing returns suggest that we may not need terabytes of data to estimate a parameter or test a hypothesis. But which rows of data should we analyze, and might an arbitrary subset preserve the features of the original data? We review a line of work grounded in theoretical computer science and numerical linear algebra that finds that an algorithmically desirable sketch, which is a randomly chosen subset of the data, must preserve the eigenstructure of the data, a property known as subspace embedding. Building on this work, we study how prediction and inference can be affected by data sketching within a linear regression setup. We use statistical arguments to provide “inference-conscious” guides to the sketch size and show that an estimator that pools over different sketches can be nearly as efficient as the infeasible one using the full sample.
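The idea in the abstract of estimating a linear regression on a random sketch, then pooling over several sketches, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' procedure: the data are simulated, the sketch is a uniform random subset of rows drawn without replacement, and "pooling" is taken to mean a simple average of the per-sketch OLS estimates. The function names `sketched_ols` and `pooled_sketched_ols`, the sketch size `m`, and the number of sketches are all hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients via numpy.linalg.lstsq."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def sketched_ols(X, y, m, rng):
    """OLS on a uniform random subset of m rows (drawn without replacement)."""
    idx = rng.choice(X.shape[0], size=m, replace=False)
    return ols(X[idx], y[idx])

def pooled_sketched_ols(X, y, m, n_sketches, rng):
    """Average the OLS estimates from n_sketches independently drawn sketches.

    Simple averaging is an assumption here; it stands in for whatever
    pooling rule the article actually analyzes.
    """
    estimates = np.stack(
        [sketched_ols(X, y, m, rng) for _ in range(n_sketches)]
    )
    return estimates.mean(axis=0)

# Simulated data (hypothetical; not from the article).
rng = np.random.default_rng(0)
n, d = 20_000, 3
beta_true = np.array([1.0, -2.0, 0.5])
X = rng.standard_normal((n, d))
y = X @ beta_true + rng.standard_normal(n)

beta_full = ols(X, y)                        # full-sample benchmark
beta_one = sketched_ols(X, y, m=500, rng=rng)  # a single sketch of 500 rows
beta_pooled = pooled_sketched_ols(X, y, m=500, n_sketches=20, rng=rng)

print("full sample  :", beta_full)
print("single sketch:", beta_one)
print("pooled       :", beta_pooled)
```

In this setup a single sketch uses 2.5% of the rows, so its estimate is noticeably noisier than the full-sample one, while averaging over 20 sketches should recover much of the lost precision. Uniform sampling is only one of the sketching schemes studied in this literature.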

Date: 2020
Citations: 7 (in EconPapers)

Downloads:
https://doi.org/10.1146/annurev-economics-022720-114138
Full text downloads are only available to subscribers.

Related works:
Working Paper: An Econometric Perspective on Algorithmic Subsampling (2020)
Working Paper: An econometric perspective on algorithmic subsampling (2020)

Persistent link: https://EconPapers.repec.org/RePEc:anr:reveco:v:12:y:2020:p:45-80

Ordering information: This journal article can be ordered from
http://www.annualreviews.org/action/ecommerce

DOI: 10.1146/annurev-economics-022720-114138

Publisher: Annual Reviews, 4139 El Camino Way, Palo Alto, CA 94306, USA.
Bibliographic data for series maintained by http://www.annualreviews.org.

 
Page updated 2025-03-28
Handle: RePEc:anr:reveco:v:12:y:2020:p:45-80