Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis

Li, Ben; Li, Yunxiao; Qin, Zhaohui S.

Ben Li, Yunxiao Li and Zhaohui S. Qin
Additional contact information
Ben Li: Emory University
Yunxiao Li: Emory University
Zhaohui S. Qin: Emory University

Statistics in Biosciences, 2017, vol. 9, issue 1, No 5, 73-90

Abstract: Modern high-throughput biotechnologies such as microarray and next-generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only a limited amount of data is observed for each individual feature, which gives rise to the classical “large p, small n” problem. The Bayesian hierarchical model, capable of borrowing strength across features within the same dataset, has been recognized as an effective tool for analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical models, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that, in the Big Data era, large amounts of historical data are available and should be taken advantage of. It presents a new framework to enhance the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrate the superior performance of the proposed strategy. The strategy also enables borrowing information across different platforms, which could be extremely useful given the emergence of new technologies and the accumulation of data from different platforms in the Big Data era. Our method has been implemented in the R package “adaptiveHM,” which is freely available from https://github.com/benliemory/adaptiveHM.
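
The over-correction described in the abstract can be illustrated with a textbook normal-normal shrinkage calculation. The sketch below is not the adaptiveHM implementation and does not use its API; the function name, the numbers, and the idea of a feature-specific prior taken from historical data are all assumptions chosen only to show the effect. A feature whose true level is far from the rest is pulled strongly toward the pooled prior when n is small, whereas a prior centered on that feature's historical level leaves it close to the observed value.

```python
def posterior_mean(sample_mean, sample_var, n, prior_mean, prior_var):
    """Posterior mean of a normal mean under a normal prior.

    Assumes the sampling variance and prior variance are known
    (the standard conjugate normal-normal setup).
    """
    # Weight given to the observed data; the rest goes to the prior mean.
    weight = prior_var / (prior_var + sample_var / n)
    return weight * sample_mean + (1.0 - weight) * prior_mean


# Hypothetical feature that genuinely differs from the bulk of features.
sample_mean, sample_var, n = 5.0, 1.0, 3

# Prior pooled across features in the current dataset (centered at 0).
pooled_estimate = posterior_mean(sample_mean, sample_var, n,
                                 prior_mean=0.0, prior_var=0.25)

# Hypothetical prior centered on this feature's level in historical data.
historical_estimate = posterior_mean(sample_mean, sample_var, n,
                                     prior_mean=4.8, prior_var=0.25)

print(f"pooled prior:     {pooled_estimate:.3f}")
print(f"historical prior: {historical_estimate:.3f}")
```

With the pooled prior the estimate falls to less than half of the observed mean, while the historical prior keeps it near the observed value; both use the same shrinkage weight, so the difference comes entirely from where the prior is centered.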

Keywords: Bayesian hierarchical model; Historical data; Informative prior; 450K methylation array; Bisulfite sequencing
Date: 2017

Downloads:
http://link.springer.com/10.1007/s12561-016-9156-x (abstract, text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:stabio:v:9:y:2017:i:1:d:10.1007_s12561-016-9156-x

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/12561

DOI: 10.1007/s12561-016-9156-x

Statistics in Biosciences is currently edited by Hongyu Zhao and Xihong Lin

More articles in Statistics in Biosciences from Springer, International Chinese Statistical Association
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.