Economics at your fingertips  

A penalized regression approach for DNA copy number study using the sequencing data

Lee Jaeeun and Chen Jie ()
Additional contact information
Chen Jie: Division of Biostatistics and Data Science, Department of Population Health Sciences, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA

Statistical Applications in Genetics and Molecular Biology, 2019, vol. 18, issue 4, 14

Abstract: Modeling the high-throughput next generation sequencing (NGS) data, resulting from experiments with the goal of profiling tumor and control samples for the study of DNA copy number variants (CNVs), remains to be a challenge in various ways. In this application work, we provide an efficient method for detecting multiple CNVs using NGS reads ratio data. This method is based on a multiple statistical change-points model with the penalized regression approach, 1d fused LASSO, that is designed for ordered data in a one-dimensional structure. In addition, since the path algorithm traces the solution as a function of a tuning parameter, the number and locations of potential CNV region boundaries can be estimated simultaneously in an efficient way. For tuning parameter selection, we then propose a new modified Bayesian information criterion, called JMIC, and compare the proposed JMIC with three different Bayes information criteria used in the literature. Simulation results have shown the better performance of JMIC for tuning parameter selection, in comparison with the other three criterion. We applied our approach to the sequencing data of reads ratio between the breast tumor cell lines HCC1954 and its matched normal cell line BL 1954 and the results are in-line with those discovered in the literature.

Keywords: change point analysis; CNVs; fused LASSO; modified information criterion; next generation sequencing data; penalized regression (search for similar items in EconPapers)
Date: 2019
References: Add references at CitEc
Citations: Track citations by RSS feed

Downloads: (external link) (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link:

Ordering information: This journal article can be ordered from

DOI: 10.1515/sagmb-2018-0001

Access Statistics for this article

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla ().

Page updated 2021-05-07
Handle: RePEc:bpj:sagmbi:v:18:y:2019:i:4:p:14:n:1