Assessing the performance of methods for copy number aberration detection from single-cell DNA sequencing data

Mallory, Xian F; Edrisi, Mohammadamin; Navin, Nicholas; Nakhleh, Luay

PLOS Computational Biology, 2020, vol. 16, issue 7, 1-24

Abstract: Single-cell DNA sequencing technologies are enabling the study of mutations and their evolutionary trajectories in cancer. Somatic copy number aberrations (CNAs) have been implicated in the development and progression of various types of cancer. A wide array of methods for CNA detection has been either developed specifically for or adapted to single-cell DNA sequencing data. Understanding the strengths and limitations that are unique to each of these methods is very important for obtaining accurate copy number profiles from single-cell DNA sequencing data. We benchmarked three widely used methods (Ginkgo, HMMcopy, and CopyNumber) on simulated as well as real datasets. To facilitate this, we developed a novel simulator of single-cell genome evolution in the presence of CNAs. Furthermore, to assess performance on empirical data where the ground truth is unknown, we introduce a phylogeny-based measure for identifying potentially erroneous inferences. While single-cell DNA sequencing is very promising for elucidating and understanding CNAs, our findings show that even the best existing method does not exceed 80% accuracy. New methods that significantly improve upon the accuracy of these three methods are needed. Furthermore, with the large datasets being generated, the methods must be computationally efficient.

Author summary: Copy number aberrations, or CNAs, refer to evolutionary events that act on cancer genomes by deleting segments of the genomes or introducing new copies of existing segments. These events have been implicated in various types of cancer; consequently, their accurate detection could shed light on the initiation and progression of tumors, as well as on the development of potential targeted therapeutics. Single-cell DNA sequencing technologies are now producing the type of data that would allow such detection at the resolution of individual cells. However, to achieve this detection task, methods have to implement several steps of “data wrangling” and deal with technical artifacts. In this work, we benchmarked three widely used methods for CNA detection from single-cell DNA data, namely Ginkgo, HMMcopy, and CopyNumber. To carry out this study, we developed a novel simulator and devised a phylogeny-based measure of potentially erroneous CNA calls. We find that none of these methods has high accuracy, and all of them can be computationally very demanding. These findings call for the development of more accurate and more efficient methods for CNA detection from single-cell DNA data.

Date: 2020

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008012 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 08012&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1008012

DOI: 10.1371/journal.pcbi.1008012
