
A foundation model of transcription across human cell types

Xi Fu, Shentong Mo, Alejandro Buendia, Anouchka P. Laurent, Anqi Shao, Maria del Mar Alvarez-Torres, Tianji Yu, Jimin Tan, Jiayu Su, Romella Sagatelian, Adolfo A. Ferrando, Alberto Ciccia, Yanyan Lan, David M. Owens, Teresa Palomero, Eric P. Xing and Raul Rabadan
Additional contact information
Xi Fu: Columbia University
Shentong Mo: Mohamed bin Zayed University of Artificial Intelligence
Alejandro Buendia: Columbia University
Anouchka P. Laurent: Columbia University
Anqi Shao: Columbia University
Maria del Mar Alvarez-Torres: Columbia University
Tianji Yu: Columbia University
Jimin Tan: New York University Grossman School of Medicine
Jiayu Su: Columbia University
Romella Sagatelian: Columbia University
Adolfo A. Ferrando: Columbia University
Alberto Ciccia: Columbia University
Yanyan Lan: Tsinghua University
David M. Owens: Columbia University
Teresa Palomero: Columbia University
Eric P. Xing: Mohamed bin Zayed University of Artificial Intelligence
Raul Rabadan: Columbia University

Nature, 2025, vol. 637, issue 8047, 965-973

Abstract: Transcriptional regulation, which involves a complex interplay between regulatory sequences and proteins, directs all biological processes. Computational models of transcription lack generalizability to accurately extrapolate to unseen cell types and conditions. Here we introduce GET (general expression transformer), an interpretable foundation model designed to uncover regulatory grammars across 213 human fetal and adult cell types [1,2]. Relying exclusively on chromatin accessibility data and sequence information, GET achieves experimental-level accuracy in predicting gene expression even in previously unseen cell types [3]. GET also shows remarkable adaptability across new sequencing platforms and assays, enabling regulatory inference across a broad range of cell types and conditions, and uncovers universal and cell-type-specific transcription factor interaction networks. We evaluated its performance in prediction of regulatory activity, inference of regulatory elements and regulators, and identification of physical interactions between transcription factors and found that it outperforms current models [4] in predicting lentivirus-based massively parallel reporter assay readout [5,6]. In fetal erythroblasts [7], we identified distal (greater than 1 Mbp) regulatory regions that were missed by previous models, and, in B cells, we identified a lymphocyte-specific transcription factor–transcription factor interaction that explains the functional significance of a leukaemia risk predisposing germline mutation [8–10]. In sum, we provide a generalizable and accurate model for transcription together with catalogues of gene regulation and transcription factor interactions, all with cell type specificity.

Date: 2025

Downloads: (external link)
https://www.nature.com/articles/s41586-024-08391-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.


Persistent link: https://EconPapers.repec.org/RePEc:nat:nature:v:637:y:2025:i:8047:d:10.1038_s41586-024-08391-z

Ordering information: This journal article can be ordered from
https://www.nature.com/

DOI: 10.1038/s41586-024-08391-z


Nature is currently edited by Magdalena Skipper

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:nature:v:637:y:2025:i:8047:d:10.1038_s41586-024-08391-z