Fast inference for quantile regression with tens of millions of observations

Sokbae Lee, Yuan Liao, Myung Hwan Seo and Youngki Shin

Journal of Econometrics, 2025, vol. 249, issue PA

Abstract: Big data analytics has opened new avenues in economic research, but the challenge of analyzing datasets with tens of millions of observations is substantial. Conventional econometric methods based on extremum estimators require large amounts of computing resources and memory, which are often not readily available. In this paper, we focus on linear quantile regression applied to “ultra-large” datasets, such as U.S. decennial censuses. A fast inference framework is presented, utilizing stochastic subgradient descent (S-subGD) updates. The inference procedure handles cross-sectional data sequentially: (i) updating the parameter estimate with each incoming “new observation”, (ii) aggregating it as a Polyak–Ruppert average, and (iii) computing a pivotal statistic for inference using only a solution path. The methodology draws from time-series regression to create an asymptotically pivotal statistic through random scaling. Our proposed test statistic is calculated in a fully online fashion and critical values are calculated without resampling. We conduct extensive numerical studies to showcase the computational merits of our proposed inference. For inference problems as large as $(n, d) \sim (10^7, 10^3)$, where $n$ is the sample size and $d$ is the number of regressors, our method generates new insights, surpassing current inference methods in computation. Our method specifically reveals trends in the gender gap in the U.S. college wage premium using millions of observations, while controlling over $10^3$ covariates to mitigate confounding effects.
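
The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `ssubgd_quantile`, the step-size schedule `gamma0 * i**(-a)`, the default tuning values, and the zero starting point are all assumptions made for illustration, and the random-scaling matrix is written in one commonly used form, V_n = n^{-2} Σ_s s² (β̄_s − β̄_n)(β̄_s − β̄_n)′, which may differ in detail from the paper.

```python
import numpy as np

def ssubgd_quantile(stream, d, tau=0.5, gamma0=1.0, a=0.6):
    """Single pass over (x, y) pairs; all tuning values are illustrative assumptions.

    Returns the Polyak-Ruppert average, a random-scaling matrix V, and n.
    """
    beta = np.zeros(d)       # current S-subGD iterate (assumed start at zero)
    beta_bar = np.zeros(d)   # running Polyak-Ruppert average
    A = np.zeros((d, d))     # running sum of s^2 * outer(beta_bar_s, beta_bar_s)
    b = np.zeros(d)          # running sum of s^2 * beta_bar_s
    c = 0.0                  # running sum of s^2
    n = 0
    for x, y in stream:
        n += 1
        # (i) subgradient of the check loss rho_tau(y - x'beta) w.r.t. beta
        grad = x * (float(y <= x @ beta) - tau)
        beta = beta - gamma0 * n ** (-a) * grad
        # (ii) update the average without storing past iterates
        beta_bar += (beta - beta_bar) / n
        # (iii) accumulators that let V be formed from the solution path alone
        w = float(n) ** 2
        A += w * np.outer(beta_bar, beta_bar)
        b += w * beta_bar
        c += w
    if n == 0:
        raise ValueError("stream is empty")
    V = (A - np.outer(b, beta_bar) - np.outer(beta_bar, b)
         + c * np.outer(beta_bar, beta_bar)) / n ** 2
    return beta_bar, V, n

def t_statistic(beta_bar, V, n, j, beta0_j):
    """Studentized statistic for H0: beta_j = beta0_j using random scaling."""
    return np.sqrt(n) * (beta_bar[j] - beta0_j) / np.sqrt(V[j, j])
```

Memory use is O(d²) regardless of n, because only the current iterate, the running average, and three accumulators are kept. The statistic returned by `t_statistic` is not asymptotically normal under random scaling, so it should be compared with the tabulated critical values the paper relies on rather than with standard normal quantiles.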

Keywords: Large-scale inference; Stochastic gradient descent; Subgradient
Date: 2025
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0304407624000198
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:econom:v:249:y:2025:i:pa:s0304407624000198

DOI: 10.1016/j.jeconom.2024.105673

Journal of Econometrics is currently edited by T. Amemiya, A. R. Gallant, J. F. Geweke, C. Hsiao and P. M. Robinson

More articles in Journal of Econometrics from Elsevier
Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-05-20
Handle: RePEc:eee:econom:v:249:y:2025:i:pa:s0304407624000198