Powerful large scale inference in high dimensional mediation analysis
Asmita Roy and Xianyang Zhang
PLOS Computational Biology, 2026, vol. 22, issue 1, 1-23
Abstract:
In genome-wide epigenetic studies, determining how exposures (e.g., single nucleotide polymorphisms) affect outcomes (e.g., gene expression) through intermediate variables, such as DNA methylation, is a key challenge. Mediation analysis provides a framework to identify these causal pathways; however, testing for mediation effects involves a complex composite null hypothesis. Existing methods, such as Sobel’s test or the Max-P test, are often underpowered in this context because they rely on null distributions determined under only a subset of the null space and are not optimized for the multiple testing burden inherent in high-dimensional data. To address these limitations, we introduce MLFDR (Mediation Analysis using Local False Discovery Rates), a novel method for high-dimensional mediation analysis. MLFDR leverages local false discovery rates, calculated from the coefficients of structural equation models, to construct an optimal rejection region. We demonstrate theoretically and through simulation that MLFDR asymptotically controls the false discovery rate and achieves superior statistical power compared to recent high-dimensional mediation methods. In real data applications, MLFDR identified 20%–50% more significant mediators than existing methods, demonstrating its ability to uncover biological signals missed by conventional approaches.
Author summary:
The paper presents a novel approach to high-dimensional mediation analysis through a local false discovery rate (MLFDR) screening algorithm. It addresses the limitations of traditional methods such as Sobel’s test and Max-P, which are underpowered in high-dimensional settings. We extend local FDR principles to composite null hypotheses and derive a screening rule with a closed-form expression for the false discovery proportion. We also show that MLFDR performs comparably to or better than two recently proposed methods, MDACT [30] and HDMT [3], across a wide range of data types and models. In addition, we provide a two-step global latent factor adjustment using surrogate variable analysis [9].
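As a rough illustration of the idea described above, the following Python sketch computes a local false discovery rate under the composite mediation null (the exposure-mediator effect or the mediator-outcome effect is zero) and then applies a generic step-up rule that rejects the hypotheses with the smallest local FDR values while their running average stays below the target level. It is a sketch under stated assumptions, not the authors' MLFDR implementation: the function names, the known mixture proportions `pi`, and the unit-variance Gaussian alternatives with means `mu_a` and `mu_b` are all hypothetical choices made for illustration, whereas in practice such quantities would have to be estimated from the data.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2) evaluated at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def composite_null_lfdr(z_a, z_b, pi, mu_a, mu_b):
    """Posterior probability that alpha * beta = 0 given (z_a, z_b).

    Assumptions (illustrative only, not taken from the paper):
    pi = (pi00, pi01, pi10, pi11) are known proportions of the four groups
    (alpha zero or not, beta zero or not), null z-statistics are N(0, 1),
    and non-null z-statistics are N(mu, 1).
    """
    pi00, pi01, pi10, pi11 = pi
    a0, a1 = normal_pdf(z_a), normal_pdf(z_a, mu_a)
    b0, b1 = normal_pdf(z_b), normal_pdf(z_b, mu_b)
    # The composite null is the union of three of the four groups.
    null_part = pi00 * a0 * b0 + pi01 * a0 * b1 + pi10 * a1 * b0
    return null_part / (null_part + pi11 * a1 * b1)


def reject_by_lfdr(lfdr, q):
    """Reject the k smallest lfdr values, with k the largest rank whose
    running mean of sorted lfdr values is at most q. Returns sorted indices."""
    order = sorted(range(len(lfdr)), key=lambda i: lfdr[i])
    running_sum, k = 0.0, 0
    for rank, i in enumerate(order, start=1):
        running_sum += lfdr[i]
        if running_sum / rank <= q:
            k = rank
    return sorted(order[:k])


# Hypothetical z-statistics (z_a, z_b) for four candidate mediators.
pairs = [(4.0, 4.5), (3.5, 0.2), (0.1, -0.3), (5.0, 3.8)]
pi = (0.6, 0.15, 0.15, 0.1)
lfdr = [composite_null_lfdr(a, b, pi, mu_a=3.0, mu_b=3.0) for a, b in pairs]
rejected = reject_by_lfdr(lfdr, q=0.05)
print(rejected)
```

The running average of the rejected local FDR values serves as an estimate of the false discovery proportion, which is why the rule stops at the largest rank where that average is still below `q`. A candidate with only one strong statistic, such as the second pair, keeps a high local FDR because the single-effect null groups explain it well.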
Date: 2026
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013880 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13880&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013880
DOI: 10.1371/journal.pcbi.1013880
More articles in PLOS Computational Biology from Public Library of Science