Robust Regression Analysis of Copy Number Variation Data based on a Univariate Score
Glen A Satten, Andrew S Allen, Morna Ikeda, Jennifer G Mulle and Stephen T Warren
PLOS ONE, 2014, vol. 9, issue 2, 1-8
Abstract:
Motivation: The discovery that copy number variants (CNVs) are widespread in the human genome has motivated the development of numerous algorithms that attempt to detect CNVs from intensity data. However, all approaches are plagued by high false discovery rates. Further, because CNVs are characterized by two dimensions (length and intensity), it is unclear how to order called CNVs to prioritize experimental validation.
Results: We developed a univariate score that correlates with the likelihood that a CNV is true. This score can be used to order CNV calls so that calls with larger scores are more likely to overlap a true CNV. We developed cnv.beast, a computationally efficient algorithm for calling CNVs that uses robust backward elimination regression to keep CNV calls whose scores exceed a user-defined threshold. Using an independent dataset measured on a different platform, we validated our score and showed that our approach performed better than six other currently available methods.
Availability: cnv.beast is available at http://www.duke.edu/~asallen/Software.html.
Date: 2014
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0086272 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 86272&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0086272
DOI: 10.1371/journal.pone.0086272