
Assessing agreement of clustering methods with gene expression microarray data

Xueli Liu, Sheng-Chien Lee, George Casella and Gary F. Peter

Computational Statistics & Data Analysis, 2008, vol. 52, issue 12, 5356-5366

Abstract: In the rapidly evolving field of genomics, many clustering and classification methods have been developed and employed to explore patterns in gene expression data. Biologists must choose which clustering algorithm(s) to use and decide how to interpret the differing results those algorithms produce. No clear objective criteria have been developed to assess the agreement between different clustering methods and to compare their results. We describe two generally applicable objective measures to quantify agreement between different clustering methods. These two measures are referred to as the local agreement measure, which is defined for each gene/subject, and the global agreement measure, which is defined for the whole gene expression experiment. The agreement measures are based on a probabilistic weighting scheme applied to the number of concordant and discordant pairs from two clustering methods. In the comparison and assessment process, newly developed concepts are implemented under the framework of reliability of a cluster. The algorithms are illustrated by simulations and then applied to yeast sporulation gene expression microarray data. Analysis of the sporulation data identified approximately 5% (23 of 477) of the genes as not consistently clustered by a neural net algorithm and K-means or PAM. The two agreement measures provide objective criteria to conclude whether or not two clustering methods agree with each other. Using the local agreement measure, genes of unknown function that cluster consistently can more confidently be assigned functions based on co-regulation.
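
The pair-counting idea behind the two measures can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the probabilistic weighting scheme, so the sketch assumes every pair has equal weight, which reduces the global score to the ordinary Rand index. The function name `pair_agreement` and its return layout are hypothetical. A pair of items is counted as concordant when both clusterings put the two items together or both put them apart.

```python
from itertools import combinations


def pair_agreement(labels_a, labels_b):
    """Unweighted pair agreement between two clusterings of the same items.

    Returns (local, global_): local[i] is the fraction of pairs involving
    item i that are concordant, global_ is the fraction over all pairs.
    Assumption: equal weight per pair, not the paper's weighting scheme.
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("label vectors must have the same length")
    if n < 2:
        raise ValueError("need at least two items")

    concordant_per_item = [0] * n
    concordant_total = 0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]  # together under method A?
        same_b = labels_b[i] == labels_b[j]  # together under method B?
        if same_a == same_b:  # both together or both apart
            concordant_per_item[i] += 1
            concordant_per_item[j] += 1
            concordant_total += 1

    local = [c / (n - 1) for c in concordant_per_item]
    global_ = concordant_total / (n * (n - 1) // 2)
    return local, global_


# Toy example: the two methods disagree only on where the last item belongs.
local, global_ = pair_agreement([0, 0, 1, 1], [0, 0, 1, 0])
```

In this toy example the global score is 0.5 and the last item has a local score of 0, which marks it as the inconsistently clustered one. In the abstract's terms, items with low local scores play the role of the genes that were not consistently clustered.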

Date: 2008

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167-9473(08)00302-2
Full text for ScienceDirect subscribers only.


Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:52:y:2008:i:12:p:5356-5366


Computational Statistics & Data Analysis is currently edited by S.P. Azen

Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:52:y:2008:i:12:p:5356-5366