Projection-Uniform Subsampling Methods for Big Data

Yuxin Sun, Wenjun Liu and Ye Tian
Additional contact information
Yuxin Sun: Key Laboratory of Mathematics and Information Networks (Beijing University of Posts and Telecommunications), Ministry of Education, Beijing 100876, China
Wenjun Liu: Key Laboratory of Mathematics and Information Networks (Beijing University of Posts and Telecommunications), Ministry of Education, Beijing 100876, China
Ye Tian: Key Laboratory of Mathematics and Information Networks (Beijing University of Posts and Telecommunications), Ministry of Education, Beijing 100876, China

Mathematics, 2024, vol. 12, issue 19, 1-16

Abstract: Ideas from experimental design have been widely used in subsampling algorithms to extract a small portion of big data that carries useful information for statistical modeling. Most existing subsampling algorithms of this kind are model-based and are designed to satisfy an optimality criterion for the assumed model. However, data-generating models are frequently unknown or complicated. Model-free subsampling algorithms are therefore needed to obtain subsamples that remain robust when the model is misspecified or complex. This paper introduces two novel algorithms: the Projection-Uniform Subsampling algorithm and an extension of it. Both algorithms aim to extract a subset of samples from big data that is space-filling in low-dimensional projections. We show that subdata obtained from our algorithms perform favorably under the uniform projection criterion and the centered L2-discrepancy. Our algorithms are compared with model-based and model-free methods in two simulation studies and two real-world case studies. We demonstrate the robustness of the proposed algorithms for building statistical models in scenarios involving model misspecification and model complexity.
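
The two criteria named in the abstract can be made concrete with a small sketch. It is not the paper's subsampling algorithm; it only evaluates a given set of points. It assumes the data have been rescaled to the unit cube, uses the standard closed form of the squared centered L2-discrepancy, and takes the uniform projection criterion to be the average of that quantity over all two-dimensional projections. The function names `centered_l2_sq` and `uniform_projection` are hypothetical, chosen for illustration.

```python
from itertools import combinations


def centered_l2_sq(points):
    """Squared centered L2-discrepancy of points in [0, 1]^d (assumed pre-scaled)."""
    n = len(points)
    d = len(points[0])
    # Single-sum term: one product over coordinates per point.
    single = 0.0
    for x in points:
        prod = 1.0
        for xk in x:
            a = abs(xk - 0.5)
            prod *= 1.0 + 0.5 * a - 0.5 * a * a
        single += prod
    # Double-sum term: one product over coordinates per ordered pair of points.
    pair = 0.0
    for x in points:
        for y in points:
            prod = 1.0
            for xk, yk in zip(x, y):
                prod *= (1.0 + 0.5 * abs(xk - 0.5) + 0.5 * abs(yk - 0.5)
                         - 0.5 * abs(xk - yk))
            pair += prod
    return (13.0 / 12.0) ** d - 2.0 * single / n + pair / (n * n)


def uniform_projection(points):
    """Average squared centered L2-discrepancy over all 2-D projections.

    Assumption: this averaging form is used here as the uniform projection
    criterion; smaller values mean more uniform 2-D projections.
    """
    d = len(points[0])
    pairs = list(combinations(range(d), 2))
    total = sum(
        centered_l2_sq([(p[i], p[j]) for p in points]) for i, j in pairs
    )
    return total / len(pairs)


if __name__ == "__main__":
    # A small space-filling grid versus a tightly clustered set, both in 3-D.
    grid = [(a, b, c) for a in (0.25, 0.75)
            for b in (0.25, 0.75) for c in (0.25, 0.75)]
    clustered = [(0.1 + 0.01 * i, 0.1, 0.1) for i in range(8)]
    print("grid:", uniform_projection(grid))
    print("clustered:", uniform_projection(clustered))
```

Under these assumptions, a candidate subsample with a smaller value of either function would be regarded as more space-filling. The pairwise term costs O(n^2 d) operations, so the sketch is suited to scoring a selected subsample rather than the full data set.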

Keywords: model-free subsampling; space-filling design; uniform projection criterion; centered L2-discrepancy
JEL-codes: C
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2227-7390/12/19/2985/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/19/2985/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:19:p:2985-:d:1485701

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:12:y:2024:i:19:p:2985-:d:1485701