A Cross-Validation Study to Select a Classification Procedure for Clinical Diagnosis Based on Proteomic Mass Spectrometry

Valkenborg Dirk, Suzy Van Sanden, Lin Dan, Kasim Adetayo, Zhu Qi, Haldermans Philippe, Jansen Ivy, Shkedy Ziv and Burzykowski Tomasz
Additional contact information
Valkenborg Dirk: Hasselt University, Center for Statistics
Suzy Van Sanden: Hasselt University, Center for Statistics
Lin Dan: Hasselt University, Center for Statistics
Kasim Adetayo: Hasselt University, Center for Statistics
Zhu Qi: Hasselt University, Center for Statistics
Haldermans Philippe: Hasselt University, Center for Statistics
Jansen Ivy: Hasselt University, Center for Statistics
Shkedy Ziv: Hasselt University, Center for Statistics
Burzykowski Tomasz: Hasselt University, Center for Statistics

Statistical Applications in Genetics and Molecular Biology, 2008, vol. 7, issue 2, 22

Abstract: We present an approach to construct a classification rule based on the mass spectrometry data provided by the organizers of the "Classification Competition on Clinical Mass Spectrometry Proteomic Diagnosis Data." Before constructing a classification rule, we attempted to pre-process the data and to select features of the spectra that were likely due to true biological signals (i.e., peptides/proteins). As a result, we selected a set of 92 features. To construct the classification rule, we considered eight methods for selecting a subset of the features, combined with seven classification methods. The performance of the resulting 56 combinations was evaluated by using a cross-validation procedure with 1000 re-sampled data sets. The best result, as indicated by the lowest overall misclassification rate, was obtained by using the whole set of 92 features as the input for a support-vector machine (SVM) with a linear kernel. This method was therefore used to construct the classification rule. For the training data set, the total error rate for the classification rule, as estimated by using leave-one-out cross-validation, was equal to 0.16, with the sensitivity and specificity equal to 0.87 and 0.82, respectively.
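The evaluation described in the abstract can be illustrated with a small sketch. The abstract reports 8 feature-selection methods combined with 7 classification methods (8 × 7 = 56 combinations), and a final rule summarized by a leave-one-out total error rate, sensitivity, and specificity. The code below only shows how those three summary measures come out of a leave-one-out loop. It is not the authors' pipeline: the data are synthetic, a nearest-centroid classifier stands in for the linear-kernel SVM, and every function name (`make_toy_data`, `nearest_centroid_predict`, `leave_one_out`, `summarize`) is hypothetical.

```python
import random


def make_toy_data(n_per_class=20, n_features=5, shift=1.0, seed=0):
    """Synthetic two-class data (assumption: NOT the competition spectra)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            # Class 1 is shifted by `shift` in every feature.
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier: pick the class whose centroid is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(X, y):
    """Predict each sample from a rule fitted on all the other samples."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(nearest_centroid_predict(train_X, train_y, X[i]))
    return preds


def summarize(y_true, y_pred, positive=1):
    """Total error rate, sensitivity, and specificity from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "error": (fp + fn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Number of (feature selection, classifier) pairs reported in the abstract.
n_combinations = 8 * 7

X, y = make_toy_data()
preds = leave_one_out(X, y)
metrics = summarize(y, preds)
print(n_combinations, metrics)
```

In a comparison like the one the abstract describes, any feature selection would have to be repeated inside each training fold so that the held-out sample never influences which features are chosen. The sketch skips that step because it uses all features, as the selected rule did.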

Keywords: proteomic MALDI-TOFMS preprocessing; feature selection; two-stage cross-validation; classification for clinical diagnosis
Date: 2008

Downloads: (external link)
https://doi.org/10.2202/1544-6115.1363 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:7:y:2008:i:2:n:12

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.2202/1544-6115.1363

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

Page updated 2025-03-19
Handle: RePEc:bpj:sagmbi:v:7:y:2008:i:2:n:12