MethylBERT enables read-level DNA methylation pattern identification and tumour deconvolution using a Transformer-based model
Yunhee Jeong,
Clarissa Gerhäuser,
Guido Sauter,
Thorsten Schlomm,
Karl Rohr and
Pavlo Lutsik
Additional contact information
Yunhee Jeong: German Cancer Research Center (DKFZ)
Clarissa Gerhäuser: German Cancer Research Center (DKFZ)
Guido Sauter: University Medical Center Hamburg-Eppendorf
Thorsten Schlomm: Charité – Universitätsmedizin Berlin
Karl Rohr: Heidelberg University
Pavlo Lutsik: German Cancer Research Center (DKFZ)
Nature Communications, 2025, vol. 16, issue 1, 1-14
Abstract:
DNA methylation (DNAm) is a key epigenetic mark that shows profound alterations in cancer. Read-level methylomes enable more in-depth analyses, due to their broad genomic coverage and preservation of rare cell-type signals, compared to summarized data such as 450K/EPIC microarrays. Here, we propose MethylBERT, a Transformer-based model for read-level methylation pattern classification. MethylBERT identifies tumour-derived sequence reads based on their methylation patterns and local genomic sequence, and estimates tumour cell fractions within bulk samples. In our evaluation, MethylBERT outperforms existing deconvolution methods and demonstrates high accuracy regardless of methylation pattern complexity, read length and read coverage. Moreover, we show its applicability to cell-type deconvolution as well as non-invasive early cancer diagnostics using liquid biopsy samples. MethylBERT represents a significant advancement in read-level methylome analysis and enables accurate tumour purity estimation. The broad applicability of MethylBERT will enhance studies on both tumour and non-cancerous bulk methylomes.
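To illustrate how read-level classifications can be turned into a tumour cell fraction, the following is a minimal sketch of a generic two-component mixture estimate, not MethylBERT's own estimator. It assumes that a classifier has already produced, for each read, a likelihood under a tumour model and under a normal model. The function name `estimate_tumour_fraction` and the parameters `p_tumour`, `p_normal` and `grid_size` are hypothetical and do not come from the article or its software.

```python
import math


def estimate_tumour_fraction(p_tumour, p_normal, grid_size=1001):
    """Grid-search maximum-likelihood estimate of the tumour read fraction.

    Hypothetical helper for illustration only. Each read i is modelled as
    drawn from the mixture  theta * p_tumour[i] + (1 - theta) * p_normal[i],
    where the two lists hold per-read likelihoods under each class.
    """
    if not p_tumour or len(p_tumour) != len(p_normal):
        raise ValueError("need two non-empty likelihood lists of equal length")
    if grid_size < 2:
        raise ValueError("grid_size must be at least 2")

    best_theta, best_ll = 0.0, -math.inf
    for i in range(grid_size):
        theta = i / (grid_size - 1)  # candidate fraction in [0, 1]
        ll = 0.0
        for pt, pn in zip(p_tumour, p_normal):
            mix = theta * pt + (1.0 - theta) * pn
            if mix <= 0.0:
                # This theta cannot explain the read at all.
                ll = -math.inf
                break
            ll += math.log(mix)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta


if __name__ == "__main__":
    # Ten reads: three look tumour-like, seven look normal-like (made-up values).
    p_t = [0.9] * 3 + [0.1] * 7
    p_n = [0.1] * 3 + [0.9] * 7
    print(estimate_tumour_fraction(p_t, p_n))
```

With these made-up likelihoods the estimate is about 0.25 rather than the naive 3/10, because the mixture model accounts for the chance that a tumour-like read came from normal cells and vice versa.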
Date: 2025
Downloads:
https://www.nature.com/articles/s41467-025-55920-z Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-55920-z
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-025-55920-z
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.