EconPapers    
Economics at your fingertips  
 

A data-driven approach for constructing mutation categories for mutational signature analysis

Gal Gilad, Mark D M Leiserson and Roded Sharan

PLOS Computational Biology, 2021, vol. 17, issue 10, 1-15

Abstract: Mutational processes shape the genomes of cancer patients and their understanding has important applications in diagnosis and treatment. Current modeling of mutational processes by identifying their characteristic signatures views each base substitution in a limited context of a single flanking base on each side. This context definition gives rise to 96 categories of mutations that have become the standard in the field, even though wider contexts have been shown to be informative in specific cases. Here we propose a data-driven approach for constructing a mutation categorization for mutational signature analysis. Our approach is based on the assumption that tumor cells that are exposed to similar mutational processes, show similar expression levels of DNA damage repair genes that are involved in these processes. We attempt to find a categorization that maximizes the agreement between mutation and gene expression data, and show that it outperforms the standard categorization over multiple quality measures. Moreover, we show that the categorization we identify generalizes to unseen data from different cancer types, suggesting that mutation context patterns extend beyond the immediate flanking bases.Author summary: Cancer is a group of genetic diseases that occur as a result of an accumulation of somatic mutations in genes that regulate cellular growth and differentiation. These mutations arise from mutagenic processes such as exposure to environmental mutagens and defective DNA damage repair pathways. Each of these processes results in a characteristic pattern of mutations, referred to as a mutational signature. These signatures reveal the mutagenic mechanisms that have influenced the development of a specific tumor, and thus provide new insights into its causes and potential treatments. Originally, a mutational signature has been defined using 96 mutation categories that take into account solely the information from the mutated base and its flanking bases. Here, we aim to challenge this arbitrary categorization, which is widely used in mutational signature analysis. We have developed a novel framework for the construction of mutation categories that is based on the assumption that the activities of DNA damage repair genes are correlated with the mutational processes that are active in a given tumor. We show that using this approach we are able to identify an alternative mutation categorization that outperforms the standard categorization with respect to multiple metrics. This categorization includes categories that account for bases that extend beyond the immediate flanking bases, suggesting that mutational signatures should be studied in broader sequence contexts.

Date: 2021
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009542 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 09542&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1009542

DOI: 10.1371/journal.pcbi.1009542

Access Statistics for this article

More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol ().

 
Page updated 2025-03-19
Handle: RePEc:plo:pcbi00:1009542