An Empirical Bayes approach for the identification of long-range chromosomal interaction from Hi-C data

Zhang Qi, Xu Zheng and Lai Yutong
Additional contact information
Zhang Qi: Department of Mathematics and Statistics, University of New Hampshire, Durham, NH 03824, USA
Xu Zheng: Department of Mathematics and Statistics, Wright State University, Dayton, OH 45435, USA
Lai Yutong: ClinChoice, Fort Washington, PA 19034, USA

Statistical Applications in Genetics and Molecular Biology, 2021, vol. 20, issue 1, 1-15

Abstract: Hi-C experiments have become very popular for studying the 3D genome structure in recent years. Identification of long-range chromosomal interaction, i.e., peak detection, is crucial for Hi-C data analysis. However, it remains a challenging task because of the inherent high dimensionality, sparsity, and over-dispersion of the Hi-C count data matrix. We propose EBHiC, an empirical Bayes approach for peak detection from Hi-C data. The proposed framework provides flexible over-dispersion modeling by explicitly including the “true” interaction intensities as latent variables. To implement the proposed peak identification method (via the empirical Bayes test), we estimate the overall distribution of the observed counts semiparametrically using a Smoothed Expectation Maximization algorithm, and the empirical null based on the zero assumption. We conducted extensive simulations to validate and evaluate the performance of our proposed approach and applied it to real datasets. Our results suggest that EBHiC identifies better peaks in terms of accuracy, biological interpretability, and consistency across biological replicates. The source code is available on GitHub (https://github.com/QiZhangStat/EBHiC).

Keywords: empirical Bayes; epigenetics; Hi-C; peak identification
Date: 2021
Downloads: (external link)
https://doi.org/10.1515/sagmb-2020-0026 (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:20:y:2021:i:1:p:1-15:n:3

Ordering information: This journal article can be ordered from
https://www.degruyter.com/view/j/sagmb

DOI: 10.1515/sagmb-2020-0026

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf


Handle: RePEc:bpj:sagmbi:v:20:y:2021:i:1:p:1-15:n:3