Motif-based models accurately predict cell type-specific distal regulatory elements
Paola Cornejo-Páramo,
Xuan Zhang,
Lithin Louis,
Zelun Li,
Yihua Yang and
Emily S. Wong
Additional contact information
Paola Cornejo-Páramo: Victor Chang Cardiac Research Institute
Xuan Zhang: Victor Chang Cardiac Research Institute
Lithin Louis: Victor Chang Cardiac Research Institute
Zelun Li: Victor Chang Cardiac Research Institute
Yihua Yang: Victor Chang Cardiac Research Institute
Emily S. Wong: Victor Chang Cardiac Research Institute
Nature Communications, 2025, vol. 16, issue 1, 1-15
Abstract:
Deciphering how DNA sequence specifies cell-type-specific regulatory activity is a central challenge in gene regulation. We present Bag-of-Motifs (BOM), a computational framework that represents distal cis-regulatory elements as unordered counts of transcription factor (TF) motifs. This minimalist representation, combined with gradient-boosted trees, enables the accurate prediction of cell-type-specific enhancers across mouse, human, zebrafish, and Arabidopsis datasets. Despite its simplicity, BOM outperforms more complex deep-learning models while using fewer parameters. We validate BOM’s predictions experimentally by constructing synthetic enhancers from the most predictive motifs, demonstrating that these motif sets drive cell-type-specific expression. By providing direct interpretability and broad applicability, BOM reveals a highly predictive sequence code at distal regulatory regions and offers a scalable framework for dissecting cis-regulatory grammar across diverse species and conditions.
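To make the "unordered counts of TF motifs" representation concrete, here is a minimal sketch of the feature-building step only. It is not the authors' implementation. The motif names and consensus strings in TOY_MOTIFS, and the helper names, are hypothetical. Exact string matching stands in for the position-weight-matrix scanning a real pipeline would more plausibly use. The resulting count vectors are what a gradient-boosted tree classifier would take as input; that step is not shown.

```python
# Toy motifs as exact consensus strings. This is an assumption made for
# illustration: real TF motifs are usually position weight matrices.
TOY_MOTIFS = {
    "GATA": "GATA",
    "EBOX": "CAGCTG",  # palindromic: equal to its own reverse complement
    "TATA": "TATAAA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_occurrences(seq: str, pattern: str) -> int:
    """Count exact matches of pattern in seq, including overlapping ones."""
    count = 0
    start = seq.find(pattern)
    while start != -1:
        count += 1
        start = seq.find(pattern, start + 1)
    return count


def bag_of_motifs(seq: str, motifs: dict = TOY_MOTIFS) -> dict:
    """Map each motif name to its match count on both strands.

    Positions and order of matches are discarded; only counts are kept.
    """
    seq = seq.upper()
    counts = {}
    for name, pattern in motifs.items():
        n = count_occurrences(seq, pattern)
        rc = reverse_complement(pattern)
        if rc != pattern:  # avoid double-counting palindromic motifs
            n += count_occurrences(seq, rc)
        counts[name] = n
    return counts


# Two toy sequences with the same motifs in a different order.
features_a = bag_of_motifs("GATACAGCTGTATC")
features_b = bag_of_motifs("CAGCTGGATATATC")
```

Because only counts are retained, the two toy sequences above yield identical feature vectors even though their motifs appear in a different order. That order-independence is the property the "unordered" representation relies on.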
Date: 2025
Downloads:
https://www.nature.com/articles/s41467-025-65362-2 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-65362-2
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-025-65362-2
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.