Distance-correlation based gene set analysis in longitudinal studies

Sun Jiehuan, Herazo-Maya Jose D., Huang Xiu, Kaminski Naftali and Zhao Hongyu
Additional contact information
Sun Jiehuan: Department of Biostatistics, Yale School of Public Health, New Haven, CT 06510, USA
Herazo-Maya Jose D.: Internal Medicine: Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, CT 06519, USA
Huang Xiu: Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
Kaminski Naftali: Internal Medicine: Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, CT 06519, USA
Zhao Hongyu: Department of Biostatistics, Yale School of Public Health, New Haven, CT 06510, USA

Statistical Applications in Genetics and Molecular Biology, 2018, vol. 17, issue 1, 11

Abstract: Longitudinal gene expression profiles of subjects are collected in some clinical studies to monitor disease progression and understand disease etiology. Identifying gene sets whose changes over time are coordinated with relevant clinical outcomes could provide significant insights into the molecular basis of disease progression and lead to better treatments. In this article, we propose a Distance-Correlation based Gene Set Analysis (dcGSA) method for longitudinal gene expression data. dcGSA is a non-parametric, statistically robust approach that can capture both linear and nonlinear relationships between gene sets and clinical outcomes. In addition, dcGSA can identify related gene sets when the effects of gene sets on clinical outcomes differ across subjects due to subject heterogeneity, remove the confounding effects of some unobserved time-invariant covariates, and assess associations between gene sets and multiple related outcomes simultaneously. Through extensive simulation studies, we demonstrate that dcGSA is more powerful at detecting relevant genes than other commonly used gene set analysis methods. When dcGSA is applied to a real dataset on systemic lupus erythematosus, we are able to identify more disease-related gene sets than other methods.
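
The abstract's claim that distance correlation captures nonlinear as well as linear relationships can be illustrated with a small sketch. The code below computes the standard sample distance correlation between two sets of observations and compares it with Pearson correlation on a quadratic relationship. It is not the dcGSA procedure, whose longitudinal steps are not described in this record; the function names (`distance_correlation`, `pearson`) and the toy data are assumptions made for illustration only.

```python
import math


def _centered_distances(rows):
    """Pairwise Euclidean distance matrix, double-centered."""
    n = len(rows)
    d = [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    row_means = [sum(r) / n for r in d]
    grand_mean = sum(row_means) / n
    # d is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between paired observations.

    x and y are equal-length lists of tuples (one tuple per observation);
    the tuples in x may have a different dimension from those in y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length, at least 2")
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)  # squared sample distance covariance
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / denom)


def pearson(u, v):
    """Plain Pearson correlation of two equal-length numeric lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((s - mu) * (t - mv) for s, t in zip(u, v))
    var_u = sum((s - mu) ** 2 for s in u)
    var_v = sum((t - mv) ** 2 for t in v)
    return cov / math.sqrt(var_u * var_v)


# Toy data (hypothetical): a symmetric grid and its square.
grid = [i / 10 for i in range(-10, 11)]
squares = [t * t for t in grid]

linear_dependence = pearson(grid, squares)
general_dependence = distance_correlation(
    [(t,) for t in grid], [(s,) for s in squares]
)
print(f"Pearson: {linear_dependence:.3f}, distance correlation: {general_dependence:.3f}")
```

On this toy data the Pearson correlation is essentially zero because the relationship is symmetric, while the distance correlation is clearly positive. Observations are passed as tuples so that either side can be multivariate, for example several genes in a set against several clinical outcomes.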

Keywords: distance correlation; gene set analysis; longitudinal gene expression study
Date: 2018

Downloads: (external link)
https://doi.org/10.1515/sagmb-2017-0053 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:17:y:2018:i:1:p:11:n:2

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.1515/sagmb-2017-0053

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19
Handle: RePEc:bpj:sagmbi:v:17:y:2018:i:1:p:11:n:2