Robust regression with compositional covariates
Aditya Mishra and Christian L. Müller
Computational Statistics & Data Analysis, 2022, vol. 165, issue C
Abstract:
Many biological high-throughput datasets, such as targeted amplicon-based and metagenomic sequencing data, are compositional. A common exploratory data analysis task is to infer robust statistical associations between high-dimensional microbial compositions and habitat- or host-related covariates. To address this, a general robust statistical regression framework, RobRegCC (Robust Regression with Compositional Covariates), is proposed, which extends the linear log-contrast model by a mean shift formulation for capturing outliers. RobRegCC includes sparsity-promoting convex and non-convex penalties for parsimonious model estimation, a data-driven robust initialization procedure, and a novel robust cross-validation model selection scheme. The procedure is implemented in the R package robregcc. Extensive simulation studies show RobRegCC's ability to perform simultaneous sparse log-contrast regression and outlier detection over a wide range of settings. To demonstrate the seamless applicability of the workflow to real data, a gut microbiome dataset from HIV patients is analyzed, and robust associations between a sparse set of microbial species and host immune response from soluble CD14 measurements are inferred.
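The mean shift idea in the abstract can be illustrated with a small simulation. The sketch below is not the RobRegCC algorithm and does not use the robregcc package. It assumes a log-contrast model y = log(X) beta + gamma + noise with sum(beta) = 0, and swaps the paper's penalized estimators for plain least squares alternated with hard thresholding of the residuals. The data, the threshold value, and the function name `fit_mean_shift` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 6

# Simulated compositions: each row is positive and sums to 1.
W = rng.normal(size=(n, p))
X = np.exp(W) / np.exp(W).sum(axis=1, keepdims=True)
Z = np.log(X)

# Assumed true log-contrast coefficients: sparse and summing to zero.
beta_true = np.array([1.0, -1.0, 0.5, -0.5, 0.0, 0.0])
y = Z @ beta_true + 0.1 * rng.normal(size=n)

# Plant gross outliers: these are the nonzero entries of the mean shift vector.
outliers = np.arange(5)
y[outliers] += 10.0

# Enforce sum(beta) = 0 by writing beta = A @ alpha; the columns of A
# (e_j - e_p) span the zero-sum subspace.
A = np.vstack([np.eye(p - 1), -np.ones((1, p - 1))])
ZA = Z @ A


def fit_mean_shift(ZA, y, threshold=5.0, n_iter=50):
    """Alternate least squares on y - gamma with hard thresholding of residuals."""
    gamma = np.zeros_like(y)
    for _ in range(n_iter):
        # Coefficient step: ordinary least squares on the shift-corrected response.
        alpha, *_ = np.linalg.lstsq(ZA, y - gamma, rcond=None)
        # Shift step: keep only residuals larger than the threshold.
        resid = y - ZA @ alpha
        gamma = np.where(np.abs(resid) > threshold, resid, 0.0)
    return alpha, gamma


alpha_hat, gamma_hat = fit_mean_shift(ZA, y)
beta_hat = A @ alpha_hat               # zero-sum coefficient estimate
detected = np.flatnonzero(gamma_hat)   # observations flagged as outliers
print("flagged observations:", detected)
print("sum of coefficients:", beta_hat.sum())
```

Observations with a nonzero entry in `gamma_hat` are the ones flagged as outliers. The fixed threshold stands in for the data-driven tuning that the abstract attributes to robust cross-validation, and no sparsity penalty is applied to the coefficients here.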
Keywords: Compositional data; Robust; Mean shift; Sparsity; Microbiome
Date: 2022
Downloads:
http://www.sciencedirect.com/science/article/pii/S0167947321001493
Full text for ScienceDirect subscribers only.
Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:165:y:2022:i:c:s0167947321001493
DOI: 10.1016/j.csda.2021.107315
Computational Statistics & Data Analysis is currently edited by S.P. Azen
Bibliographic data for series maintained by Catherine Liu.