Systematic clustering algorithm for chromatin accessibility data and its application to hematopoietic cells

Azusa Tanaka, Yasuhiro Ishitsuka, Hiroki Ohta, Akihiro Fujimoto, Jun-ichirou Yasunaga and Masao Matsuoka

PLOS Computational Biology, 2020, vol. 16, issue 11, 1-27

Abstract: The huge amount of data acquired by high-throughput sequencing requires data reduction for effective analysis. Here we give a clustering algorithm for genome-wide open chromatin data using a new data reduction method. This method regards the genome as a string of 1s and 0s based on a set of peaks and calculates the Hamming distances between the strings. With a systematically optimized set of peaks, this algorithm enables us to quantitatively evaluate differences between samples of hematopoietic cells and classify cell types, potentially leading to a better understanding of leukemia pathogenesis.

Author summary: High-throughput sequencing provides us with huge amounts of data about gene regulation. In order to extract useful information from the data, data reduction is needed. Although the analysis of RNA-seq data, which focuses mainly on genetic loci, has been extensively studied, tools for epigenetic sequencing data, such as ATAC-seq data, which represent chromatin accessibility, are comparatively lacking. Since the binding of transcription factors mainly occurs in open chromatin regions, it is presumably important to understand how the chromatin accessibility landscape affects cell phenotype. In this context, we developed a systematic algorithm to select a set of peaks representing the open state of chromatin for a given sample of ATAC-seq data. This algorithm quantifies the difference between samples by regarding the genome as a string of 1s and 0s and computing the Hamming distances between the strings, and then performs hierarchical clustering. This algorithm has a lower computational cost than a previous method and gives a reasonable cell type classification. In this work, as an application of this algorithm, we present a comparative analysis of leukemia samples with healthy hematopoietic cells and provide new insights about the relationship between chromatin structures, cell surface proteins, and symptoms in leukemia.
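The core idea described above (binary encoding of peaks, Hamming distances, hierarchical clustering) can be illustrated with a small sketch. This is not the authors' implementation: the fixed-size binning, the average-linkage rule, the function names (`peaks_to_bits`, `hamming`, `average_linkage`), and the toy sample names and coordinates are all assumptions made for illustration, and the systematic optimization of the peak set is not covered.

```python
from itertools import combinations

def peaks_to_bits(peaks, genome_length, bin_size):
    """Encode peaks (half-open [start, end) intervals) as one 0/1 value per bin.

    Fixed-size binning is an assumption of this sketch, not the paper's method.
    """
    n_bins = (genome_length + bin_size - 1) // bin_size
    bits = [0] * n_bins
    for start, end in peaks:
        last = min((end - 1) // bin_size, n_bins - 1)
        for b in range(start // bin_size, last + 1):
            bits[b] = 1  # bin is "open" if any peak overlaps it
    return bits

def hamming(a, b):
    """Number of positions at which two equal-length 0/1 lists differ."""
    return sum(x != y for x, y in zip(a, b))

def average_linkage(names, bits):
    """Naive agglomerative clustering; returns merges as (left, right, distance)."""
    clusters = [(name,) for name in names]
    dist = {frozenset((p, q)): hamming(bits[p], bits[q])
            for p, q in combinations(names, 2)}

    def d(c1, c2):
        # Average pairwise Hamming distance between two clusters.
        total = sum(dist[frozenset((p, q))] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: d(*pair))
        merges.append((c1, c2, d(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical toy samples on a 1000 bp "genome"; coordinates are made up.
samples = {
    "HSC_1": [(0, 100), (400, 500)],
    "HSC_2": [(0, 100), (400, 500), (600, 650)],
    "Bcell_1": [(200, 300), (700, 800)],
    "Bcell_2": [(200, 300), (700, 900)],
}
bits = {name: peaks_to_bits(p, genome_length=1000, bin_size=100)
        for name, p in samples.items()}
merges = average_linkage(list(samples), bits)

for left, right, distance in merges:
    print(left, "+", right, "at distance", distance)
```

In this toy example the two hypothetical HSC samples merge first, then the two B-cell samples, and the two groups join last. For real data, a library routine such as SciPy's hierarchical clustering would likely be a better choice than this quadratic-memory loop.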

Date: 2020

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008422 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 08422&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1008422

DOI: 10.1371/journal.pcbi.1008422

More articles in PLOS Computational Biology from Public Library of Science
Handle: RePEc:plo:pcbi00:1008422