Clustering Binary Oligonucleotide Fingerprint Vectors for DNA Clone Classification Analysis

Zhipeng Cai, Maysam Heydari and Guohui Lin
Additional contact information
Zhipeng Cai: University of Alberta
Maysam Heydari: University of Alberta
Guohui Lin: University of Alberta

Journal of Combinatorial Optimization, 2005, vol. 9, issue 2, No 5, 199-211

Abstract: We considered the problem of clustering binarized oligonucleotide fingerprints. Oligonucleotide fingerprinting is a powerful DNA array based method to characterize cDNA and rRNA libraries and has many applications, including gene expression profiling and DNA clone classification. DNA clone classification is the main application for the problem considered in this paper. Most of the existing approaches for clustering use normalized real intensity values and thus do not treat positive and negative hybridization signals equally. This is demonstrated in a series of recent publications where a discrete approach typically useful in the classification of microbial rRNA clones has been proposed. In the discrete approach, hybridization intensities are normalized and thresholds are set such that a value of 1 represents hybridization, a value of 0 represents no hybridization, and an N represents unknown, which is also called a missing value. A combinatorial optimization problem is then formulated that attempts to cluster the fingerprints and resolve the missing values simultaneously. It has been observed that missing values cause much difficulty in clustering analysis and that most clustering methods are very sensitive to them. In this paper, we returned to the traditional clustering problem, which admits no missing values, but with the revised goal of stabilizing the number of clusters and maintaining the clustering quality. We adopted the binarizing scheme used in the discrete approach, as it has been shown to be typically useful for clone classification. We formulated this problem as another combinatorial optimization problem. The computational complexity of this new clustering problem and its relationships to the discrete approach and the traditional clustering problem were studied.
We have designed an exact algorithm for the new clustering problem, which is an A* search algorithm for finding a minimum number of clusters. The experimental results on two commonly tested real datasets demonstrated that the A* search algorithm runs fast and performs better than some popular hierarchical clustering methods, in terms of separating clones that have different characteristics with respect to the given oligonucleotide probes.
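
As an illustration of the thresholding step the abstract describes for the discrete approach, the following is a minimal sketch and not the authors' implementation. The function name `binarize` and the threshold values `low` and `high` are assumptions made for this example; the abstract does not state how thresholds are chosen. Intensities at or above `high` become 1, those at or below `low` become 0, and anything in between becomes N (a missing value).

```python
from typing import Sequence


def binarize(intensities: Sequence[float], low: float, high: float) -> str:
    """Map normalized hybridization intensities to a fingerprint string.

    Hypothetical helper: `low` and `high` are illustrative thresholds,
    not values taken from the paper.
    """
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    symbols = []
    for value in intensities:
        if value >= high:
            symbols.append("1")  # hybridization
        elif value <= low:
            symbols.append("0")  # no hybridization
        else:
            symbols.append("N")  # unknown, i.e. a missing value
    return "".join(symbols)


# One clone measured against three probes, with made-up intensities.
fingerprint = binarize([0.9, 0.1, 0.5], low=0.3, high=0.7)
```

Setting `low` equal to `high` removes the N band entirely, which corresponds to the setting without missing values that the paper returns to.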

Keywords: DNA array; oligonucleotide fingerprinting; DNA clone classification; clustering; combinatorial optimization; A* search; evaluation function
Date: 2005

Downloads: (external link)
http://link.springer.com/10.1007/s10878-005-6857-3 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:jcomop:v:9:y:2005:i:2:d:10.1007_s10878-005-6857-3

Ordering information: This journal article can be ordered from
https://www.springer.com/journal/10878

DOI: 10.1007/s10878-005-6857-3

Journal of Combinatorial Optimization is currently edited by Thai, My T.

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:jcomop:v:9:y:2005:i:2:d:10.1007_s10878-005-6857-3