Pathogenicity and functional impact of non-frameshifting insertion/deletion variation in the human genome
Kymberleigh A Pagel,
Danny Antaki,
AoJie Lian,
Matthew Mort,
David N Cooper,
Jonathan Sebat,
Lilia M Iakoucheva,
Sean D Mooney and
Predrag Radivojac
PLOS Computational Biology, 2019, vol. 15, issue 6, 1-21
Abstract:
Differentiation between phenotypically neutral and disease-causing genetic variation remains an open and relevant problem. Among different types of variation, non-frameshifting insertions and deletions (indels) represent an understudied group with widespread phenotypic consequences. To address this challenge, we present a machine learning method, MutPred-Indel, that predicts pathogenicity and identifies types of functional residues impacted by non-frameshifting insertion/deletion variation. The model shows good predictive performance as well as the ability to identify impacted structural and functional residues including secondary structure, intrinsic disorder, metal and macromolecular binding, post-translational modifications, allosteric sites, and catalytic residues. We identify structural and functional mechanisms impacted preferentially by germline variation from the Human Gene Mutation Database, recurrent somatic variation from COSMIC in the context of different cancers, as well as de novo variants from families with autism spectrum disorder. Further, the distributions of pathogenicity prediction scores generated by MutPred-Indel are shown to differentiate highly recurrent from non-recurrent somatic variation. Collectively, we present a framework to facilitate the interrogation of both pathogenicity and the functional effects of non-frameshifting insertion/deletion variants. The MutPred-Indel webserver is available at http://mutpred.mutdb.org/.
Author summary:
An individual genome contains around ten thousand missense variants, hundreds of insertion/deletion variants, and dozens of protein truncating variants. Among them, non-frameshifting insertion and deletion variants exhibit diverse impact on protein sequence, encompassing alterations from a single residue to the deletion of entire functional domains. Although the majority of revealed insertion/deletions have unknown phenotypic consequences, computational variant effect prediction methods are less well-described for such variation. To this end, we develop MutPred-Indel, a machine learning method to predict the pathogenicity of non-frameshifting insertion/deletion variation and, in addition, highlight structural and functional mechanisms potentially impacted by a given variant. We identify several functionally important molecular mechanisms that are impacted differently among germline, de novo, and somatic variation in contrast to putatively neutral variation. MutPred-Indel is shown to have strong performance in pathogenicity prediction and potential to identify impacted molecular features, which collectively facilitates a deeper understanding of non-frameshifting insertion/deletion variation.
Date: 2019
Downloads:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007112 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 07112&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1007112
DOI: 10.1371/journal.pcbi.1007112