Intrinsic bias estimation for improved analysis of bulk and single-cell chromatin accessibility profiles using SELMA
Shengen Shawn Hu,
Lin Liu,
Qi Li,
Wenjing Ma,
Michael J. Guertin,
Clifford A. Meyer,
Ke Deng,
Tingting Zhang and
Chongzhi Zang ()
Additional contact information
Shengen Shawn Hu: University of Virginia
Lin Liu: Shanghai Jiao Tong University
Qi Li: Tsinghua University
Wenjing Ma: University of Virginia
Michael J. Guertin: University of Connecticut
Clifford A. Meyer: Dana-Farber Cancer Institute
Ke Deng: Tsinghua University
Tingting Zhang: University of Pittsburgh
Chongzhi Zang: University of Virginia
Nature Communications, 2022, vol. 13, issue 1, 1-17
Abstract:
Abstract Genome-wide profiling of chromatin accessibility by DNase-seq or ATAC-seq has been widely used to identify regulatory DNA elements and transcription factor binding sites. However, enzymatic DNA cleavage exhibits intrinsic sequence biases that confound chromatin accessibility profiling data analysis. Existing computational tools are limited in their ability to account for such intrinsic biases and not designed for analyzing single-cell data. Here, we present Simplex Encoded Linear Model for Accessible Chromatin (SELMA), a computational method for systematic estimation of intrinsic cleavage biases from genomic chromatin accessibility profiling data. We demonstrate that SELMA yields accurate and robust bias estimation from both bulk and single-cell DNase-seq and ATAC-seq data. SELMA can utilize internal mitochondrial DNA data to improve bias estimation. We show that transcription factor binding inference from DNase footprints can be improved by incorporating estimated biases using SELMA. Furthermore, we show strong effects of intrinsic biases in single-cell ATAC-seq data, and develop the first single-cell ATAC-seq intrinsic bias correction model to improve cell clustering. SELMA can enhance the performance of existing bioinformatics tools and improve the analysis of both bulk and single-cell chromatin accessibility sequencing data.
Date: 2022
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://www.nature.com/articles/s41467-022-33194-z Abstract (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-33194-z
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-022-33194-z
Access Statistics for this article
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
More articles in Nature Communications from Nature
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().