A boosting approach for adapting the sparsity of risk prediction signatures based on different molecular levels
Sariyar Murat (),
Schumacher Martin and
Binder Harald
Additional contact information
Sariyar Murat: Institute of Medical Biostatistics, Epidemiology and Informatics, Medical Center of the Johannes Gutenberg University, 55131 Mainz, Germany Institute of Pathology, Charite – University Medicine Berlin, Campus Benjamin Franklin, Berlin 12200, Germany
Schumacher Martin: Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg, Germany
Binder Harald: Institute of Medical Biostatistics, Epidemiology and Informatics, Medical Center of the Johannes Gutenberg University, 55131 Mainz, Germany
Statistical Applications in Genetics and Molecular Biology, 2014, vol. 13, issue 3, 343-357
Abstract:
Risk prediction models can link high-dimensional molecular measurements, such as DNA methylation, to clinical endpoints. For biological interpretation, often a sparse fit is desirable. Different molecular aggregation levels, such as considering DNA methylation at the CpG, gene, or chromosome level, might demand different degrees of sparsity. Hence, model building and estimation techniques should be able to adapt their sparsity according to the setting. Additionally, underestimation of coefficients, which is a typical problem of sparse techniques, should also be addressed. We propose a comprehensive approach, based on a boosting technique that allows a flexible adaptation of model sparsity and addresses these problems in an integrative way. The main motivation is to have an automatic sparsity adaptation. In a simulation study, we show that this approach reduces underestimation in sparse settings and selects more adequate model sizes than the corresponding non-adaptive boosting technique in non-sparse settings. Using different aggregation levels of DNA methylation data from a study in kidney carcinoma patients, we illustrate how automatically selected values of the sparsity tuning parameter can reflect the underlying structure of the data. In addition to that, prediction performance and variable selection stability is compared to the non-adaptive boosting approach.
Keywords: componentwise boosting; Cox model; CpG methylation; regularization; sparse models (search for similar items in EconPapers)
Date: 2014
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://doi.org/10.1515/sagmb-2013-0050 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:13:y:2014:i:3:p:15:n:4
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.1515/sagmb-2013-0050
Access Statistics for this article
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla ().