A comprehensive benchmark of single-cell Hi-C embedding tools
Dylan Plummer,
Xiuyuan Lang,
Shanshan Zhang,
Yan Li,
Jing Li and
Fulai Jin
Additional contact information
Dylan Plummer: Case Western Reserve University
Xiuyuan Lang: Case Western Reserve University
Shanshan Zhang: Case Western Reserve University
Yan Li: Case Western Reserve University
Jing Li: Case Western Reserve University
Fulai Jin: Case Western Reserve University
Nature Communications, 2025, vol. 16, issue 1, 1-19
Abstract:
Embedding is the key step in single-cell Hi-C (scHi-C) analysis, which relies on capturing biologically meaningful heterogeneity at various levels of genome architecture. To understand the strengths and limitations of existing tools in various applications, here we use ten scHi-C datasets to benchmark thirteen embedding tools, including Va3DE, a new convolutional neural network model that can accommodate large cell numbers. We built a software framework to decouple the preprocessing options of existing tools and found that no single tool works best across all datasets under default settings. The difficulty levels and preferred resolutions differ between benchmark datasets, and the choice of data representation and preprocessing strongly impacts the embedding performance. Embedding cells from early embryonic stages relies on long-range compartment-scale contacts, but resolving cell cycle phases and complex tissue requires short-range loop-scale contacts. Both random-walk and inverse document frequency (IDF) transformations prefer long-range “compartment-scale” over short-range “loop-scale” embedding, while deep-learning methods better overcome sparsity at both scales and are more versatile with different resolutions. Finally, “diagonal integration” with an independent data modality is a promising approach to distinguish similar cell subpopulations. Our findings underscore the significance of appropriate priors for scHi-C embedding and also offer insights into genome architecture heterogeneity.
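For readers unfamiliar with the IDF transformation named in the abstract, the following is a minimal illustrative sketch of a generic TF-IDF weighting applied to a cell-by-feature contact matrix. It is not taken from the paper or from any of the benchmarked tools: the function name `idf_transform`, the flattened cell-by-feature layout, and the particular formula `log(1 + n_cells / df)` are all assumptions chosen for illustration.

```python
import numpy as np

def idf_transform(counts):
    """Hypothetical TF-IDF weighting for a (n_cells, n_features) contact matrix.

    Each feature stands for one bin pair; values are non-negative contact counts.
    This is a generic sketch, not the preprocessing of any specific scHi-C tool.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]

    # Term frequency: normalize each cell by its total contact count,
    # leaving all-zero cells as zeros instead of dividing by zero.
    totals = counts.sum(axis=1, keepdims=True)
    tf = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

    # Document frequency: number of cells in which each feature is observed.
    df = (counts > 0).sum(axis=0)

    # Inverse document frequency: features seen in few cells get larger weights.
    # (Assumed formula; features never observed are clamped to df = 1.)
    idf = np.log(1.0 + n_cells / np.maximum(df, 1))

    return tf * idf

if __name__ == "__main__":
    toy = np.array([[2, 0, 2],
                    [1, 1, 0]])
    print(idf_transform(toy))
```

In this toy input, the first feature is observed in both cells and is therefore down-weighted relative to the features observed in only one cell.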
Date: 2025
Downloads: (external link)
https://www.nature.com/articles/s41467-025-64186-4 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-64186-4
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-025-64186-4
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie