Regularized estimation in sparse high-dimensional multivariate regression, with application to a DNA methylation study
Zhang Haixiang,
Zheng Yinan,
Yoon Grace,
Zhang Zhou,
Gao Tao,
Joyce Brian,
Zhang Wei,
Schwartz Joel,
Vokonas Pantel,
Colicino Elena,
Baccarelli Andrea,
Hou Lifang and
Liu Lei
Additional contact information
Zhang Haixiang: Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
Zheng Yinan: Department of Preventive Medicine, Northwestern University, Chicago, IL 60611, USA
Yoon Grace: Department of Statistics, Northwestern University, Chicago, IL 60611, USA
Zhang Zhou: Department of Preventive Medicine, Northwestern University, Chicago, IL 60611, USA
Gao Tao: Department of Preventive Medicine, Northwestern University, Chicago, IL 60611, USA
Joyce Brian: Department of Preventive Medicine, Northwestern University, Chicago, IL 60611, USA
Zhang Wei: Department of Preventive Medicine, Northwestern University, Chicago, IL 60611, USA
Schwartz Joel: Department of Environmental Health, Harvard University, Boston, MA 02115, USA
Vokonas Pantel: Department of Preventive Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Colicino Elena: Normative Aging Study, Veterans Affairs Boston Healthcare System and Boston University, Boston, MA 02118, USA
Baccarelli Andrea: Department of Environmental Health Sciences, Columbia University, New York, NY 10032, USA
Hou Lifang: Department of Preventive Medicine, Northwestern University, Chicago, IL 60611, USA
Liu Lei: Department of Preventive Medicine, Northwestern University, Chicago, IL 60611, USA
Statistical Applications in Genetics and Molecular Biology, 2017, vol. 16, issue 3, 159-171
Abstract:
In this article, we consider variable selection for correlated high-dimensional DNA methylation markers as multivariate outcomes. A novel weighted square-root LASSO procedure is proposed to estimate the regression coefficient matrix. A key feature of this method is tuning-insensitivity, which greatly simplifies the computation by obviating cross-validation for penalty parameter selection. A precision matrix obtained via the constrained ℓ1 minimization method is used to account for the within-subject correlation among multivariate outcomes. Oracle inequalities for the regularized estimators are derived. The performance of our proposed method is illustrated via extensive simulation studies. We apply our method to study the relation between smoking and high-dimensional DNA methylation markers in the Normative Aging Study (NAS).
Keywords: high-dimensional responses; multivariate regression; oracle inequality; tuning-insensitive; weighted square-root LASSO
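As a rough illustration of the tuning-insensitivity mentioned in the abstract, the sketch below fits a plain square-root LASSO for a single response on simulated data. It is not the authors' weighted multivariate procedure: there is no precision-matrix weighting, and the alternating scheme (re-estimate the noise scale, then solve an ordinary LASSO with the penalty multiplied by that scale) is a swapped-in, commonly used way to minimize the square-root LASSO objective. The function names (soft_threshold, lasso_cd, sqrt_lasso), the simulated data, and the penalty level 1.1 * sqrt(2 log p / n) are assumptions made for this example only.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, alpha, beta, n_sweeps=200, tol=1e-10):
    """Coordinate descent for (1/2n)||y - Xb||^2 + alpha * ||b||_1.

    `beta` is a warm start and is updated in place.
    """
    n, p = X.shape
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            resid += X[:, j] * old              # remove coordinate j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, alpha) / col_sq[j]
            resid -= X[:, j] * beta[j]          # put the updated coordinate back
            max_change = max(max_change, abs(beta[j] - old))
        # Relative stopping rule, so the fit scales with the response.
        if max_change <= tol * np.abs(beta).max():
            break
    return beta


def sqrt_lasso(X, y, lam, n_outer=50, tol=1e-8):
    """Minimize ||y - Xb||_2 / sqrt(n) + lam * ||b||_1 by alternating
    between the noise-scale estimate and a LASSO with penalty lam * sigma."""
    n, p = X.shape
    beta = np.zeros(p)
    sigma = np.linalg.norm(y) / np.sqrt(n)
    for _ in range(n_outer):
        if sigma == 0.0:                        # perfect fit, nothing left to do
            break
        beta = lasso_cd(X, y, lam * sigma, beta)
        new_sigma = np.linalg.norm(y - X @ beta) / np.sqrt(n)
        converged = abs(new_sigma - sigma) <= tol * sigma
        sigma = new_sigma
        if converged:
            break
    return beta, sigma


# Simulated example (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 100, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + rng.standard_normal(n)

# The penalty level depends only on n and p, not on the unknown noise level.
lam = 1.1 * np.sqrt(2.0 * np.log(p) / n)

beta_hat, sigma_hat = sqrt_lasso(X, y, lam)
# Rescaling the response (and hence the noise) needs no change in lam.
beta_scaled, sigma_scaled = sqrt_lasso(X, 10.0 * y, lam)

print("selected predictors:", np.flatnonzero(beta_hat))
print("estimated noise scale:", sigma_hat)
```

Because the penalty multiplies an unsquared residual norm, multiplying the response by a constant multiplies the fitted coefficients by the same constant while the penalty level stays fixed. This is the sense in which a single penalty choice can be reused across noise levels without cross-validation.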
Date: 2017
Download: https://doi.org/10.1515/sagmb-2016-0073 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:16:y:2017:i:3:p:159-171:n:4
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.1515/sagmb-2016-0073
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
Bibliographic data for series maintained by Peter Golla.