WAVECNV: A New Approach for Detecting Copy Number Variation by Wavelet Clustering

Yang Guo, Shuzhen Wang, A. K. Alvi Haque and Xiguo Yuan
Additional contact information
Yang Guo: The School of Computer Science and Technology, Xidian University, Xi’an 710071, China
Shuzhen Wang: The School of Computer Science and Technology, Xidian University, Xi’an 710071, China
A. K. Alvi Haque: The School of Computer Science and Technology, Xidian University, Xi’an 710071, China
Xiguo Yuan: The School of Computer Science and Technology, Xidian University, Xi’an 710071, China

Mathematics, 2022, vol. 10, issue 12, 1-11

Abstract: Copy number variation (CNV) detection based on second-generation sequencing technology underpins much genetic research, but the read depth is affected by mapping errors, repeated reads, and GC bias. Existing methods have low sensitivity to variant regions that are short and have a small variation range, so sensitivity to short variant fragments needs to improve. This study proposes a new CNV-detection method, WAVECNV, to address this issue. The algorithm applies wavelet clustering to the read depth and labels clusters as normal or abnormal according to cluster size. It then scores each genome bin as an outlier according to its distance from the normal clusters. Finally, it establishes a statistical model and calls CNVs with a p-value test. In this way, information about short variant regions is retained. WAVECNV was tested and compared with peer methods on simulated data and real cancer-sequencing data. The results show that WAVECNV is more sensitive than the existing methods and that it keeps high precision on data with low purity and coverage. In real-data experiments, WAVECNV detected more cancer genes than existing methods. The authors therefore suggest that it can serve as a routine method for genomic mutation analysis of cancer samples.
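
The pipeline outlined in the abstract (cluster the read depth, pick the normal cluster by size, score bins by distance, test with a p-value) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a simplified one-dimensional stand-in for wavelet clustering (a read-depth histogram smoothed by one level of Haar averaging, with connected dense cells treated as clusters) and a normal model for the p-value. The function names `haar_smooth` and `call_cnv_bins`, and the parameters `n_cells`, `density_frac`, and `alpha`, are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def haar_smooth(counts):
    """One level of Haar approximation: average each adjacent pair of cells."""
    if len(counts) % 2:
        counts = counts + [0]  # pad to an even length
    return [(counts[i] + counts[i + 1]) / 2 for i in range(0, len(counts), 2)]


def call_cnv_bins(read_depth, n_cells=32, density_frac=0.05, alpha=0.01):
    """Label each genome bin as 'normal', 'gain' or 'loss' (illustrative only).

    All parameters are assumptions made for this sketch, not values from the paper.
    """
    lo, hi = min(read_depth), max(read_depth)
    width = (hi - lo) / n_cells or 1.0

    # 1. Quantize read depth into grid cells (a 1-D histogram).
    cell_of = [min(int((rd - lo) / width), n_cells - 1) for rd in read_depth]
    counts = [0] * n_cells
    for c in cell_of:
        counts[c] += 1

    # 2. Low-pass wavelet step: sparse, isolated cells are damped.
    coarse = haar_smooth(counts)

    # 3. Clusters are runs of connected coarse cells above a density threshold.
    threshold = density_frac * max(coarse)
    clusters, current = [], []
    for idx, value in enumerate(coarse):
        if value > threshold:
            current.append(idx)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)

    # 4. The cluster holding the most bins is taken as the normal cluster.
    normal = set(max(clusters, key=lambda cl: sum(coarse[i] for i in cl)))
    normal_rd = [rd for rd, c in zip(read_depth, cell_of) if c // 2 in normal]
    mu = mean(normal_rd)
    sigma = pstdev(normal_rd) or 1e-9  # guard against zero spread

    # 5. Outlier score = standardized distance to the normal cluster centre,
    #    converted to a two-sided p-value under a normal model (an assumption).
    std_normal = NormalDist()
    calls = []
    for rd in read_depth:
        z = (rd - mu) / sigma
        p = 2 * (1 - std_normal.cdf(abs(z)))
        if p >= alpha:
            state = "normal"
        else:
            state = "gain" if z > 0 else "loss"
        calls.append((state, p))
    return calls


# Toy input: 200 bins near depth 100, a 5-bin gain and a 4-bin loss.
depth = [96, 98, 100, 102, 104] * 40
depth[50:55] = [150] * 5
depth[120:124] = [50] * 4
calls = call_cnv_bins(depth)
```

On this toy input the five bins at depth 150 are labelled as gains and the four bins at depth 50 as losses, even though both segments are too short to form a dense cluster of their own. A real pipeline would first correct the read depth for GC bias and mappability, which this sketch omits.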

Keywords: copy number variations; next-generation sequencing data; outlier detection; tumor purity; wavelet
JEL-codes: C
Date: 2022

Downloads: (external link)
https://www.mdpi.com/2227-7390/10/12/2151/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/12/2151/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:12:p:2151-:d:843394

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager (indexing@mdpi.com).

 
Page updated 2024-12-28
Handle: RePEc:gam:jmathe:v:10:y:2022:i:12:p:2151-:d:843394