Hold out the genome: a roadmap to solving the cis-regulatory code
Carl G. Boer and Jussi Taipale
Additional contact information
Carl G. Boer: University of British Columbia
Jussi Taipale: University of Helsinki
Nature, 2024, vol. 625, issue 7993, 41-50
Abstract:
Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The ‘cis-regulatory code’ (how cells interpret DNA sequences to determine when, where and how much genes should be expressed) has proven to be exceedingly complex. Recently, advances in the scale and resolution of functional genomics assays and machine learning have enabled substantial progress towards deciphering this code. However, the cis-regulatory code will probably never be solved if models are trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and our genome is too short and has insufficient sequence diversity to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable testing a far larger sequence space than exists in our genomes, and designed DNA sequences enable targeted queries to maximally improve the models. As the same biochemical principles are used to interpret DNA regardless of its source, models trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by a combination of machine learning and massively parallel assays using synthetic DNA.
Date: 2024
Downloads: https://www.nature.com/articles/s41586-023-06661-w (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:nat:nature:v:625:y:2024:i:7993:d:10.1038_s41586-023-06661-w
Ordering information: This journal article can be ordered from
https://www.nature.com/
DOI: 10.1038/s41586-023-06661-w
Nature is currently edited by Magdalena Skipper
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.