A Semiparametric Kernel Independence Test With Application to Mutational Signatures
DongHyuk Lee and Bin Zhu
Journal of the American Statistical Association, 2021, vol. 116, issue 536, 1648-1661
Abstract:
Cancers arise owing to somatic mutations, and the characteristic combinations of somatic mutations form mutational signatures. Although many mutational signatures have been identified, the mutational processes underlying a number of them remain unknown, which hinders the identification of interventions that may reduce somatic mutation burdens and prevent the development of cancer. We demonstrate that the unknown cause of a mutational signature can be inferred from associated signatures with known etiology. However, existing association tests lack statistical power because of excess zeros in mutational signatures data. To address this limitation, we propose a semiparametric kernel independence test (SKIT). The SKIT statistic is defined as the integrated squared distance between mixed probability distributions and is decomposed into four disjoint components to pinpoint the source of dependency. We derive the asymptotic null distribution and prove the asymptotic convergence of power. Because convergence to the asymptotic null distribution is slow, a bootstrap method is employed to compute p-values. Simulation studies demonstrate that when zeros are prevalent, SKIT is more resilient to power loss than existing tests and is robust to random errors. We applied SKIT to The Cancer Genome Atlas mutational signatures data for over 9,000 tumors across 32 cancer types and identified a novel association between signature 17, curated in the Catalogue of Somatic Mutations in Cancer, and apolipoprotein B mRNA editing enzyme (APOBEC) signatures in gastrointestinal cancers. This association indicates that APOBEC activity is likely associated with the unknown cause of signature 17. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
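The abstract mentions computing p-values by a bootstrap because the asymptotic null distribution is approached slowly. The sketch below illustrates that general idea only; it is not the SKIT statistic or the authors' procedure. As a stand-in it uses a simple distance between the joint empirical CDF and the product of the marginal empirical CDFs, and it approximates the null by resampling the two variables independently of each other. The function names `cdf_distance_stat` and `bootstrap_pvalue`, the choice of statistic, and the resampling scheme are all assumptions made for illustration.

```python
import random


def cdf_distance_stat(x, y):
    """Illustrative statistic (NOT the SKIT statistic): mean squared gap
    between the joint empirical CDF and the product of the marginal
    empirical CDFs, evaluated at the observed sample points."""
    n = len(x)
    total = 0.0
    for a, b in zip(x, y):
        fx = sum(xi <= a for xi in x) / n
        fy = sum(yi <= b for yi in y) / n
        fxy = sum(xi <= a and yi <= b for xi, yi in zip(x, y)) / n
        total += (fxy - fx * fy) ** 2
    return total / n


def bootstrap_pvalue(x, y, n_boot=200, seed=0):
    """Approximate a p-value by resampling x and y independently with
    replacement, which mimics the null hypothesis of independence
    (an assumed scheme, chosen for simplicity)."""
    rng = random.Random(seed)
    n = len(x)
    observed = cdf_distance_stat(x, y)
    exceed = 0
    for _ in range(n_boot):
        xb = [rng.choice(x) for _ in range(n)]
        yb = [rng.choice(y) for _ in range(n)]
        if cdf_distance_stat(xb, yb) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    # Synthetic zero-inflated data: roughly half the values are exact zeros.
    rng = random.Random(1)
    x = [0.0 if rng.random() < 0.5 else rng.random() for _ in range(40)]
    y = list(x)  # perfectly dependent on x
    print(bootstrap_pvalue(x, y))
```

With perfectly dependent inputs such as the synthetic pair above, the observed statistic should sit far in the tail of the resampled values, so the estimated p-value is small. The pairwise loops make each evaluation quadratic in the sample size, which is acceptable for a sketch but not for data on the scale described in the abstract.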
Date: 2021
Full text: http://hdl.handle.net/10.1080/01621459.2020.1871357 (text/html; access restricted to subscribers)
Persistent link: https://EconPapers.repec.org/RePEc:taf:jnlasa:v:116:y:2021:i:536:p:1648-1661
DOI: 10.1080/01621459.2020.1871357
Journal of the American Statistical Association is currently edited by Xuming He, Jun Liu, Joseph Ibrahim and Alyson Wilson