Can random walking on a Hi-C contact matrix lead to data quality improvement? An assessment
Yongqi Liu and Shili Lin
PLOS ONE, 2025, vol. 20, issue 9, 1-18
Abstract:
Hi-C and single-cell Hi-C (scHi-C) data are now routinely generated to study a range of biological questions, including whole-genome chromatin organization, with the aim of better understanding the hierarchical three-dimensional structure of chromosomes: compartments, Topologically Associated Domains (TADs), and long-range interactions. Because of concerns about data quality, especially for scHi-C given its sparsity, data quality improvement is seen as a necessary step before analyses that address biological questions. Methods have been developed accordingly, among them a set of “random walk”-based methods, including random walk with a limited number of steps (RWS) and random walk with restart (RWR). Nevertheless, there is little justification for the use of such methods, and little quantification of their performance. Taking correct identification of TADs as the end point, in this paper we describe the characteristics of random-walk-based approaches and empirically investigate TAD identification before and after random walks. Because practical guidelines for choosing the tuning parameters are lacking, it is difficult to know how many steps to take for RWS, or how small a restart probability to choose for RWR, to achieve good performance. Even in the unrealistic scenario in which one has the hindsight to use the optimal parameter values, first performing a random walk yielded little improvement in downstream TAD analyses. This conclusion is based on extensive analytical work, a simulation study, and real data applications. The current study therefore provides a cautionary note to researchers who may consider using random-walk-based approaches prior to downstream analyses.
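The two random-walk variants named in the abstract can be illustrated with a minimal sketch. It uses the common textbook formulations (row-normalize the contact matrix into a transition matrix P, then take P^t for RWS, or the stationary solution of Q = (1 - a) Q P + a I for RWR); these are assumptions for illustration and may differ from the exact definitions used in the paper. The function names, the toy matrix, and the handling of empty rows are hypothetical.

```python
import numpy as np

def transition_matrix(contacts):
    """Row-normalize a contact matrix into a Markov transition matrix P."""
    contacts = np.asarray(contacts, dtype=float)
    row_sums = contacts.sum(axis=1, keepdims=True)
    # Assumption of this sketch: rows with no contacts are left as all zeros.
    return np.divide(contacts, row_sums,
                     out=np.zeros_like(contacts), where=row_sums > 0)

def random_walk_steps(contacts, n_steps):
    """RWS (assumed form): n_steps-step transition probabilities, P^n_steps."""
    P = transition_matrix(contacts)
    return np.linalg.matrix_power(P, n_steps)

def random_walk_restart(contacts, restart_prob):
    """RWR (assumed form): solve Q = (1 - a) Q P + a I for Q,
    which gives Q = a * (I - (1 - a) P)^{-1} for restart probability a > 0."""
    P = transition_matrix(contacts)
    n = P.shape[0]
    return restart_prob * np.linalg.inv(np.eye(n) - (1.0 - restart_prob) * P)

# Toy 3-bin contact matrix (made-up counts, not data from the paper).
toy_contacts = np.array([[4.0, 2.0, 0.0],
                         [2.0, 4.0, 1.0],
                         [0.0, 1.0, 4.0]])

smoothed_rws = random_walk_steps(toy_contacts, n_steps=3)
smoothed_rwr = random_walk_restart(toy_contacts, restart_prob=0.3)
```

In this formulation the number of steps and the restart probability are the tuning parameters the abstract refers to: more steps, or a smaller restart probability, spread contact weight further from the original entries.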
Date: 2025
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0327100 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 27100&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0327100
DOI: 10.1371/journal.pone.0327100
More articles in PLOS ONE from Public Library of Science