Clustering Proteomics Data Using Bayesian Principal Component Analysis
Halima Bensmail,
O. John Semmes and
Abdelali Haoudi
Additional contact information
Halima Bensmail: University of Tennessee
O. John Semmes: Eastern Virginia Medical School
Abdelali Haoudi: Eastern Virginia Medical School
A chapter in Data Mining in Biomedicine, 2007, pp 339-362 from Springer
Abstract:
Bioinformatics clustering tools are useful at all levels of proteomic data analysis. Proteomics studies can provide a wealth of information and rapidly generate large quantities of data from the analysis of biological specimens from healthy and diseased individuals. The high dimensionality of the data generated by these studies requires improved bioinformatics tools for efficient and accurate data analysis. Proteome profiling of a particular system or organism requires specialized software tools. However, there have not been significant advances in the informatics and software tools needed to support the analysis and management of the massive amounts of data generated in the process. Clustering algorithms based on probabilistic and Bayesian models provide an alternative to heuristic algorithms. Choosing the number of diseased and non-diseased groups (the number of clusters) reduces to choosing the number of components in a mixture of underlying probability distributions. The Bayesian approach is a tool for incorporating information from the data into the analysis, and it provides estimates of the uncertainty in the data and in the parameters involved. We present novel algorithms that cluster and derive meaningful patterns of expression from large-scale proteomics experiments. We processed the raw data using principal component analysis to reduce the number of peaks. A Bayesian model-based clustering algorithm was then applied to the transformed data. The Bayesian model-based approach showed superior performance, consistently selecting the correct model and the correct number of clusters, thus providing a novel approach for accurate diagnosis of the disease.
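The two-stage workflow described in the abstract (principal component analysis to reduce the peaks, then model-based clustering with the number of clusters chosen as the number of mixture components) can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a Gaussian mixture fitted by EM from scikit-learn and uses BIC to pick the number of components, where the chapter uses a fully Bayesian model-based method. The synthetic data, the function names `make_synthetic_spectra` and `cluster_spectra`, and all parameter values are assumptions made only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def make_synthetic_spectra(n_per_group=60, n_peaks=200, seed=0):
    """Simulate peak intensities for two groups (hypothetical data, not from the chapter)."""
    rng = np.random.default_rng(seed)
    healthy = rng.normal(0.0, 1.0, size=(n_per_group, n_peaks))
    diseased = rng.normal(0.0, 1.0, size=(n_per_group, n_peaks))
    diseased[:, :20] += 3.0  # assumed disease signal in the first 20 peaks
    return np.vstack([healthy, diseased])


def cluster_spectra(X, n_components=3, max_clusters=4, seed=0):
    """Reduce peaks with PCA, then choose the number of mixture components by BIC."""
    # Step 1: project the high-dimensional peak data onto a few principal components.
    scores = PCA(n_components=n_components, random_state=seed).fit_transform(X)

    # Step 2: fit one Gaussian mixture per candidate number of clusters.
    # BIC stands in here for the Bayesian model comparison used in the chapter.
    bic = {}
    models = {}
    for k in range(1, max_clusters + 1):
        gmm = GaussianMixture(
            n_components=k, covariance_type="full", n_init=5, random_state=seed
        ).fit(scores)
        models[k] = gmm
        bic[k] = float(gmm.bic(scores))

    # The candidate with the lowest BIC is taken as the number of clusters.
    best_k = min(bic, key=bic.get)
    labels = models[best_k].predict(scores)
    return scores, bic, best_k, labels


X = make_synthetic_spectra()
scores, bic, best_k, labels = cluster_spectra(X)
print("BIC by number of clusters:", {k: round(v, 1) for k, v in bic.items()})
print("selected number of clusters:", best_k)
```

On real spectra the number of retained components and the covariance structure would need to be chosen with care; the values above are placeholders.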
Keywords: Clustering; Principal component analysis; Proteomics; Bayesian analysis
Date: 2007
Persistent link: https://EconPapers.repec.org/RePEc:spr:spochp:978-0-387-69319-4_19
Ordering information: This item can be ordered from
http://www.springer.com/9780387693194
DOI: 10.1007/978-0-387-69319-4_19
More chapters in Springer Optimization and Its Applications from Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.