ACMTF-R: Supervised multi-omics data integration uncovering shared and distinct outcome-associated variation
Geert Roelof van der Ploeg,
Fred T G White,
Rasmus Riemer Jakobsen,
Johan A Westerhuis,
Anna Heintz-Buschart and
Age K Smilde
PLOS ONE, 2026, vol. 21, issue 1, 1-27
Abstract:
The rapid growth of high-dimensional biological data has necessitated advanced data fusion techniques to integrate and interpret complex multi-omics and longitudinal datasets. Shared and unshared structure across such datasets can be identified in an unsupervised manner with Advanced Coupled Matrix and Tensor Factorization (ACMTF), but this cannot be related to an outcome. Conversely, N-way Partial Least Squares (NPLS) is supervised and captures outcome-associated variation but cannot identify shared and unshared structure. To bridge the gap between data exploration and prediction, we introduce ACMTF-Regression (ACMTF-R), an extension of ACMTF that incorporates a regression step, allowing for the simultaneous decomposition of multi-way data while explicitly capturing variation associated with a dependent variable. We present a detailed mathematical formulation of ACMTF-R, including its optimisation algorithm and implementation. Through extensive simulations, we systematically evaluate its ability to recover a small y-related component shared between multiple blocks, its robustness to noise, and the impact of the tuning parameter (π) which controls the balance between data exploration and outcome prediction. Our results demonstrate that ACMTF-R can robustly identify the y-related component, correctly identifying outcome-associated shared and distinct variation, distinguishing it from existing approaches such as NPLS and ACMTF. The development of ACMTF-R was motivated by a real-world dataset investigating how maternal pre-pregnancy BMI affects the human milk microbiome, human milk metabolome, and infant faecal microbiome. Emerging evidence suggests that inter-generational transfer of maternal obesity may affect multiple omics layers, highlighting the need to identify outcome-associated variation. The applicability of ACMTF-R is therefore validated by applying it to this multi-omics dataset. 
ACMTF-R successfully identifies novel mother-infant relationships associated with maternal pre-pregnancy BMI, underscoring its utility in multi-omics research. Our findings establish ACMTF-R as a versatile tool for multi-way data fusion, offering new insights into complex biological systems by integrating common, local, and distinct variation in the context of a dependent variable.
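The role of the tuning parameter π described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation or implementation. It assumes, purely for illustration, two matrix blocks sharing a subject-mode factor matrix `A`, a regression coefficient vector `rho`, and a loss that weights reconstruction error by π and prediction error by 1 − π. The function name `weighted_fusion_loss` and all variable names are hypothetical. The tensor blocks, component weights, and penalties used in ACMTF-type models are omitted.

```python
import numpy as np


def weighted_fusion_loss(X1, X2, y, A, B1, B2, rho, pi):
    """Hypothetical pi-weighted loss (an assumption, not the paper's objective).

    X1, X2 : data blocks sharing the row (subject) mode
    A      : shared subject-mode factor matrix
    B1, B2 : block-specific loading matrices
    rho    : regression coefficients mapping A to the outcome y
    pi     : weight in [0, 1]; 1 = pure reconstruction, 0 = pure prediction
    """
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    # Squared Frobenius reconstruction error summed over both blocks.
    reconstruction = (
        np.linalg.norm(X1 - A @ B1.T) ** 2
        + np.linalg.norm(X2 - A @ B2.T) ** 2
    )
    # Squared error of the outcome predicted from the shared factors.
    prediction = np.linalg.norm(y - A @ rho) ** 2
    return pi * reconstruction + (1.0 - pi) * prediction


# Synthetic, noise-free example data (made up for illustration only).
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 2))
B1 = rng.normal(size=(5, 2))
B2 = rng.normal(size=(7, 2))
rho = np.array([1.0, 0.0])
X1, X2, y = A @ B1.T, A @ B2.T, A @ rho

# Offsetting the outcome leaves the reconstruction term untouched, so the
# loss then depends on pi only through the prediction term.
y_shifted = y + 1.0
losses = {p: weighted_fusion_loss(X1, X2, y_shifted, A, B1, B2, rho, p)
          for p in (0.0, 0.5, 1.0)}
```

Under these assumptions, moving π towards 1 makes the fit ignore the outcome, and moving it towards 0 makes the fit ignore the data blocks; intermediate values trade one off against the other.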
Date: 2026
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0339650 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 39650&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0339650
DOI: 10.1371/journal.pone.0339650