Lasso Logistic Regression, GSoft and the Cyclic Coordinate Descent Algorithm: Application to Gene Expression Data
Garcia-Magariños Manuel,
Antoniadis Anestis,
Cao Ricardo and
González-Manteiga Wenceslao
Additional contact information
Garcia-Magariños Manuel: Universidade de Santiago de Compostela
Antoniadis Anestis: Université Joseph Fourier
Cao Ricardo: Universidade da Coruña
González-Manteiga Wenceslao: Universidade de Santiago de Compostela
Statistical Applications in Genetics and Molecular Biology, 2010, vol. 9, issue 1, 30
Abstract:
Statistical methods that generate sparse models are of great value in the gene expression field, where the number of covariates (genes) under study runs into the thousands while sample sizes seldom reach a hundred individuals. For phenotype classification, we propose different lasso logistic regression approaches with specific penalizations for each gene. These methods are based on a generalized soft-threshold (GSoft) estimator. We also show that a recent algorithm for convex optimization, namely the cyclic coordinate descent (CCD) algorithm, provides a way to solve the optimization problem significantly faster than other competing methods. Viewing GSoft as an iterative thresholding procedure allows us to derive the asymptotic properties of the resulting estimates in a straightforward manner. Results are obtained for simulated and real data. The leukemia and colon datasets are commonly used to evaluate new statistical approaches, so they are useful for establishing comparisons with similar methods. Furthermore, biological meaning is extracted from the leukemia results and compared with previous studies. In summary, the approaches presented here give rise to sparse, interpretable models that are competitive with similar methods developed in the field.
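To make the combination of soft-thresholding and cyclic coordinate descent described in the abstract more concrete, here is a minimal Python sketch of lasso logistic regression with a separate penalty per covariate. It is not the GSoft estimator or the authors' implementation: it uses the ordinary soft-threshold operator and a simple quadratic upper bound on the logistic loss (second derivative at most 1/4) for each coordinate update. The names `soft_threshold`, `lasso_logistic_ccd`, `lam`, `n_sweeps` and `tol` are assumptions introduced for illustration, and responses are assumed to be coded as 0/1.

```python
import numpy as np


def soft_threshold(z, t):
    """Ordinary soft-threshold: shrink z toward zero by t, clipping at zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def _sigmoid(eta):
    # Clip the linear predictor to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(eta, -30.0, 30.0)))


def lasso_logistic_ccd(X, y, lam, n_sweeps=200, tol=1e-8):
    """Illustrative cyclic coordinate descent for
        (1/n) * sum_i [log(1 + exp(eta_i)) - y_i * eta_i] + sum_j lam_j * |beta_j|,
    with eta = intercept + X @ beta and y in {0, 1}.
    `lam` may be a scalar or a length-p vector of per-covariate penalties.
    """
    n, p = X.shape
    lam = np.broadcast_to(np.asarray(lam, dtype=float), (p,))
    beta = np.zeros(p)
    intercept = 0.0
    eta = np.zeros(n)  # current linear predictor, updated incrementally
    # Per-coordinate curvature bounds, since the logistic second derivative <= 1/4.
    curv = (X ** 2).sum(axis=0) / (4.0 * n)

    for _ in range(n_sweeps):
        # Unpenalized intercept: one majorization step with curvature bound 1/4.
        step = -np.mean(_sigmoid(eta) - y) / 0.25
        intercept += step
        eta += step
        max_change = abs(step)

        # Cycle through the covariates one at a time.
        for j in range(p):
            if curv[j] == 0.0:
                continue  # constant-zero column carries no information
            grad = X[:, j] @ (_sigmoid(eta) - y) / n
            new = soft_threshold(beta[j] - grad / curv[j], lam[j] / curv[j])
            delta = new - beta[j]
            if delta != 0.0:
                eta += delta * X[:, j]
                beta[j] = new
                max_change = max(max_change, abs(delta))

        if max_change < tol:
            break
    return intercept, beta
```

Passing a vector for `lam` gives each covariate its own penalty, which mirrors the gene-specific penalization idea only in spirit; a sufficiently large entry keeps the corresponding coefficient at exactly zero.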
Keywords: penalized regression; logistic regression; lasso; GSoft; CCD algorithm; optimization; gene expression
Date: 2010
Downloads:
https://doi.org/10.2202/1544-6115.1536 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:9:y:2010:i:1:n:30
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.2202/1544-6115.1536
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla.