Projection-Uniform Subsampling Methods for Big Data
Yuxin Sun,
Wenjun Liu and
Ye Tian ()
Additional contact information
Yuxin Sun: Key Laboratory of Mathematics and Information Networks (Beijing University of Posts and Telecommunications), Ministry of Education, Beijing 100876, China
Wenjun Liu: Key Laboratory of Mathematics and Information Networks (Beijing University of Posts and Telecommunications), Ministry of Education, Beijing 100876, China
Ye Tian: Key Laboratory of Mathematics and Information Networks (Beijing University of Posts and Telecommunications), Ministry of Education, Beijing 100876, China
Mathematics, 2024, vol. 12, issue 19, 1-16
Abstract:
The idea of experimental design has been widely used in subsampling algorithms to extract a small portion of big data that carries useful information for statistical modeling. Most existing subsampling algorithms of this kind are model-based and designed to achieve the corresponding optimality criteria for the model. However, data generating models are frequently unknown or complicated. Model-free subsampling algorithms are needed for obtaining samples that are robust under model misspecification and complication. This paper introduces two novel algorithms, called the Projection-Uniform Subsampling algorithm and its extension. Both algorithms aim to extract a subset of samples from big data that are space-filling in low-dimensional projections. We show that subdata obtained from our algorithms perform superiorly under the uniform projection criterion and centered L 2 -discrepancy. Comparisons among our algorithms, model-based and model-free methods are conducted through two simulation studies and two real-world case studies. We demonstrate the robustness of our proposed algorithms in building statistical models in scenarios involving model misspecification and complication.
Keywords: model-free subsampling; space-filling design; uniform projection criterion; centered L 2 -discrepancy (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2024
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://www.mdpi.com/2227-7390/12/19/2985/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/19/2985/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:19:p:2985-:d:1485701
Access Statistics for this article
Mathematics is currently edited by Ms. Emma He
More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().