Statistics Evolution and Revolution to Meet Data Science Challenges
Hulin Wu
Additional contact information
Hulin Wu: University of Texas Health Science Center at Houston
Statistics in Biosciences, 2025, vol. 17, issue 3, No 12, 813-831
Abstract:
The advent of the Big Data era has necessitated a transformational shift in statistical research, responding to the novel demands of data science. Despite extensive discourse within statistical communities on confronting these emerging challenges, we offer our unique perspectives, underscoring the extended responsibilities of statisticians in pre-analysis and post-analysis tasks. Moreover, we propose a new definition and classification of Big Data based on data sources: Type I Big Data, which is the result of aggregating a large number of small datasets via data sharing and curation, and Type II Big Data, which is the Real-World Data (RWD) amassed from business operations and practices. Each category necessitates distinct data preprocessing and preparation (DPP) methods, and the objectives of analysis as well as the interpretation of results can significantly diverge between these two types of Big Data. We further suggest that the statistical communities should consider adopting and rapidly incorporating new paradigms and cultures by learning from other disciplines. Particularly, beyond Breiman’s (Stat Sci 16(3):199–231, 2001) two modeling cultures, statisticians may need to pay more attention to a newly emerging third culture: the integration of algorithmic modeling with multi-scale dynamic modeling based on fundamental physics laws or mechanisms that generate the data. We draw from our experience in numerous related research projects to elucidate these novel concepts and perspectives.
Keywords: Data curation; Pre-analysis tasks; Post-analysis tasks (PAT); Data preprocessing and preparation (DPP); Third modeling culture
Date: 2025
Downloads: (external link)
http://link.springer.com/10.1007/s12561-024-09454-5 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:stabio:v:17:y:2025:i:3:d:10.1007_s12561-024-09454-5
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/12561
DOI: 10.1007/s12561-024-09454-5
Statistics in Biosciences is currently edited by Hongyu Zhao and Xihong Lin
More articles in Statistics in Biosciences from Springer, International Chinese Statistical Association
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.