HighDimMixedModels.jl: Robust high-dimensional mixed-effects models across omics data
Evan Gorstein, Rosa Aghdam and Claudia Solís-Lemus
PLOS Computational Biology, 2025, vol. 21, issue 1, 1-28
Abstract:
High-dimensional mixed-effects models are an increasingly important form of regression in which the number of covariates rivals or exceeds the number of samples, which are collected in groups or clusters. The penalized likelihood approach to fitting these models relies on a coordinate descent algorithm that lacks guarantees of convergence to a global optimum. Here, we empirically study the behavior of this algorithm on simulated and real examples of three types of data that are common in modern biology: transcriptome, genome-wide association, and microbiome data. Our simulations provide new insights into the algorithm’s behavior in these settings, and, comparing the performance of two popular penalties, we demonstrate that the smoothly clipped absolute deviation (SCAD) penalty consistently outperforms the least absolute shrinkage and selection operator (LASSO) penalty in terms of both variable selection and estimation accuracy across omics data. To empower researchers in biology and other fields to fit models with the SCAD penalty, we implement the algorithm in a Julia package, HighDimMixedModels.jl.
Author summary:
High-dimensional, clustered data are increasingly common in modern omics. In our study, we focus on the penalized likelihood approach to fitting mixed-effects models to these data, employing a coordinate descent (CD) algorithm to minimize the objective function. Although CD is a common optimization scheme, its convergence in this setting lacks guarantees, prompting our empirical investigation of its behavior when applied to transcriptome, genome-wide association, and microbiome datasets. We evaluate the model and algorithm’s performance on simulations of these studies and subsequently apply it to real examples of each. To help facilitate the practical application of these models and further research, we have implemented the algorithm in an open-source Julia package, HighDimMixedModels.jl. This package provides implementations of both the least absolute shrinkage and selection operator (LASSO) and the smoothly clipped absolute deviation (SCAD) penalty, and having tested its performance on various omics data sets, we hope that it offers a user-friendly solution for researchers in biology.
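Illustrative note: the abstract names the LASSO and SCAD penalties but does not state their formulas. The sketch below uses the standard textbook definitions (Fan and Li's SCAD form) to show how the two differ for a single coefficient. It is written in Python for illustration only and is not the API of HighDimMixedModels.jl; the function names, the tuning parameter `lam`, and the default `a = 3.7` (a conventional choice) are assumptions introduced here.

```python
def lasso_penalty(beta, lam):
    """LASSO penalty: grows linearly with |beta| without bound."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty (standard definition, requires a > 2).

    Matches LASSO for small |beta|, bends quadratically in the middle
    region, and is constant for |beta| > a * lam, so large coefficients
    are not penalized further.
    """
    if a <= 2:
        raise ValueError("SCAD requires a > 2")
    t = abs(beta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    # Hypothetical coefficient values chosen only to show the three regions.
    for b in (0.5, 2.0, 10.0):
        print(b, lasso_penalty(b, 1.0), scad_penalty(b, 1.0))
```

The constant tail of SCAD is the usual explanation for why it shrinks large coefficients less than LASSO does, which is consistent with the estimation-accuracy comparison the abstract reports.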
Date: 2025
Downloads:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1012143 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 12143&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1012143
DOI: 10.1371/journal.pcbi.1012143
More articles in PLOS Computational Biology from Public Library of Science