Model selection for prognostic time-to-event gene signature discovery with applications in early breast cancer data
Ahdesmäki Miika,
Lancashire Lee,
Proutski Vitali,
Wilson Claire,
Davison Timothy S.,
Harkin D. Paul and
Kennedy Richard D.
Additional contact information
Ahdesmäki Miika: Almac Diagnostics, 19 Seagoe Industrial Estate, BT63 5QD Craigavon, UK
Lancashire Lee: Almac Diagnostics, 19 Seagoe Industrial Estate, BT63 5QD Craigavon, UK
Proutski Vitali: Almac Diagnostics, 19 Seagoe Industrial Estate, BT63 5QD Craigavon, UK
Wilson Claire: Almac Diagnostics, 19 Seagoe Industrial Estate, BT63 5QD Craigavon, UK
Davison Timothy S.: Almac Diagnostics, 19 Seagoe Industrial Estate, BT63 5QD Craigavon, UK; Queen’s University of Belfast, Centre for Cancer Research and Cell Biology, BT9 7BL Belfast, UK
Harkin D. Paul: Almac Diagnostics, 19 Seagoe Industrial Estate, BT63 5QD Craigavon, UK; Queen’s University of Belfast, Centre for Cancer Research and Cell Biology, BT9 7BL Belfast, UK
Kennedy Richard D.: Almac Diagnostics, 19 Seagoe Industrial Estate, BT63 5QD Craigavon, UK; Queen’s University of Belfast, Centre for Cancer Research and Cell Biology, BT9 7BL Belfast, UK
Statistical Applications in Genetics and Molecular Biology, 2013, vol. 12, issue 5, 619-635
Abstract:
Model selection between competing models is a key consideration in the discovery of prognostic multigene signatures. The use of appropriate statistical performance measures, together with verification of the biological significance of the signatures, is imperative to maximise the chance of external validation of the generated signatures. Current approaches in time-to-event studies often use only a single measure of performance in model selection, such as logrank test p-values, or dichotomise the follow-up times at some phase of the study to facilitate signature discovery. In this study we improve the prognostic signature discovery process through the application of the multivariate partial Cox model combined with the concordance index, hazard ratio of predictions, independence from available clinical covariates and biological enrichment as measures of signature performance. The proposed framework was applied to discover prognostic multigene signatures from early breast cancer data. The partial Cox model, combined with the multiple performance measures, was used both to guide the selection of the optimal panel of prognostic genes and to predict risk within cross-validation, without dichotomising the follow-up times at any stage. The signatures were successfully externally cross-validated in independent breast cancer datasets, yielding a hazard ratio of 2.55 [1.44, 4.51] for the top-ranking signature.
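As an illustration of one of the performance measures named in the abstract, the following is a minimal sketch of a concordance index for right-censored follow-up data. It is not the authors' implementation: the function name, argument names, and the handling of ties (tied follow-up times are skipped, tied risk scores count as half) are assumptions made for this example. A pair of patients is treated as comparable only when the patient with the shorter follow-up time had an observed event.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Illustrative concordance index for right-censored data.

    times       -- follow-up times, one per patient
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- predicted risk (higher means earlier expected event)

    Assumption for this sketch: pairs with tied follow-up times are skipped,
    and tied risk scores contribute 0.5 to the concordant count.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore tied follow-up times
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    risk = [0.9, 0.7, 0.4, 0.1]
    print(concordance_index(times, events, risk))
```

Because the measure uses only the ordering of follow-up times and predicted risks, it can be computed without dichotomising the follow-up times, which is the property the abstract emphasises. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.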
Keywords: gene signature; feature selection; model selection; prognostic biomarker; time to event analysis
Date: 2013
Downloads: (external link)
https://doi.org/10.1515/sagmb-2012-0047 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:12:y:2013:i:5:p:619-635:n:5
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.1515/sagmb-2012-0047
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla.