SUITOR: Selecting the number of mutational signatures through cross-validation
Donghyuk Lee,
Difei Wang,
Xiaohong R Yang,
Jianxin Shi,
Maria Teresa Landi and
Bin Zhu
PLOS Computational Biology, 2022, vol. 18, issue 4, 1-27
Abstract:
For de novo mutational signature analysis, the critical first step is to decide how many signatures should be expected in a cancer genomics study. An incorrect number could mislead downstream analyses. Here we present SUITOR (Selecting the nUmber of mutatIonal signaTures thrOugh cRoss-validation), an unsupervised cross-validation method that requires few assumptions and no numerical approximations to select the optimal number of signatures without overfitting the data. In vitro studies and in silico simulations demonstrated that SUITOR can correctly identify signatures, some of which were missed by other widely used methods. Applied to 2,540 whole-genome sequenced tumors across 22 cancer types, SUITOR selected signatures with the smallest prediction errors, and almost all breast cancer signatures selected by SUITOR were validated in an independent breast cancer study. SUITOR is a powerful tool to select the optimal number of mutational signatures, facilitating downstream analyses with etiological or therapeutic importance.
Author summary: Mutational signatures are the footprints of exogenous exposures and endogenous mutational processes on cancer genomes. To estimate de novo mutational signatures, the first step is to decide how many signatures should be extracted in a cancer genomics study, a choice that determines downstream analytical steps and has been insufficiently studied. We developed SUITOR, an unsupervised cross-validation method to select the optimal number of signatures without overfitting the data. We demonstrated SUITOR’s superior performance using in vitro experimental studies, in silico simulations and in vivo pan-cancer applications of 2,540 whole-genome sequenced tumors across 22 cancer types, and validated signatures of breast cancer in an additional 440 breast tumors. SUITOR advances the methodological frontier of identifying de novo mutational signatures and would help discover the causes of cancer and the means of cancer prevention and treatment.
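The abstract describes choosing the number of signatures by cross-validation but does not spell out the procedure. As a rough illustration only, the general idea can be sketched with a generic entry-wise hold-out scheme for non-negative matrix factorization. Everything here is an assumption made for the example and not SUITOR's implementation: the function names (`masked_nmf`, `heldout_kl`, `select_rank`), the multiplicative-update fitting, the random fold assignment, the Kullback-Leibler error measure, and the synthetic 96 x 40 count matrix.

```python
import numpy as np

def masked_nmf(V, mask, rank, n_iter=300, seed=0, eps=1e-10):
    """Fit V ~ W @ H using only entries where mask is True.
    Uses multiplicative updates for the KL (Poisson-type) objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    M = mask.astype(float)
    for _ in range(n_iter):
        WH = W @ H + eps
        W *= ((M * V / WH) @ H.T) / (M @ H.T + eps)
        WH = W @ H + eps
        H *= (W.T @ (M * V / WH)) / (W.T @ M + eps)
    return W, H

def heldout_kl(V, WH, mask, eps=1e-10):
    """Generalized KL divergence between observed and predicted counts,
    summed over the held-out entries only."""
    v = V[mask]
    p = WH[mask] + eps
    log_term = np.where(v > 0, v * np.log(np.where(v > 0, v, 1.0) / p), 0.0)
    return float(np.sum(log_term - v + p))

def select_rank(V, ranks, n_folds=5, seed=0):
    """Return the candidate rank with the smallest mean held-out error."""
    rng = np.random.default_rng(seed)
    fold = rng.integers(0, n_folds, size=V.shape)  # random fold label per entry
    errors = {}
    for r in ranks:
        total = 0.0
        for k in range(n_folds):
            held = fold == k
            W, H = masked_nmf(V, ~held, r, seed=seed)  # train on the other folds
            total += heldout_kl(V, W @ H, held)        # score on the held-out fold
        errors[r] = total / n_folds
    best = min(errors, key=errors.get)
    return best, errors

# Synthetic example (hypothetical data): 96 mutation types, 40 tumors, 3 signatures.
rng = np.random.default_rng(1)
true_W = rng.dirichlet(np.ones(96) * 0.2, size=3).T   # 96 x 3 signature profiles
true_H = rng.gamma(2.0, 200.0, size=(3, 40))          # 3 x 40 exposures
V = rng.poisson(true_W @ true_H).astype(float)

best, errors = select_rank(V, ranks=[1, 2, 3, 4, 5])
print("selected number of signatures:", best)
```

Holding out individual matrix entries, instead of whole tumors, lets every tumor contribute to both fitting and prediction. Too few signatures underfit the held-out counts, while too many fit noise in the training entries, so the held-out error tends to be smallest near an appropriate rank.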
Date: 2022
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009309 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 09309&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1009309
DOI: 10.1371/journal.pcbi.1009309
More articles in PLOS Computational Biology from Public Library of Science