A framework for block-wise missing data in multi-omics
Sergi Baena-Miret,
Ferran Reverter and
Esteban Vegas
PLOS ONE, 2024, vol. 19, issue 7, 1-18
Abstract:
High-throughput technologies have generated vast amounts of omic data. It is a consensus that the integration of diverse omics sources improves predictive models and biomarker discovery. However, managing multiple omics data poses challenges such as data heterogeneity, noise, high-dimensionality and missing data, especially in block-wise patterns. This study addresses the challenges of high dimensionality and block-wise missing data through a regularization and constrained-based approach. The methodology is implemented in the R package bwm for binary and continuous response variables, and applied to breast cancer and exposome multi-omics datasets, achieving strong performance even in scenarios with missing data present in all omics. In binary classification task, our proposed model achieves accuracy in the range of 86% to 92%, and F1 in the range of 68% to 79%. And, in regression task the correlation between true and predicted responses is in the range of 72% to 76%. However, there is a slight decline in performance metrics as the percentage of missing data increases. In scenarios where block-wise missing data affects multiple omics, the model performance actually surpasses that of scenarios where missing data is present in only one omics. One possible explanation for this might be that the other scenarios introduce a greater diversity of observation profiles, leading to a more robust model. Depending on the specific omics being studied, there is greater consistency in feature selection when comparing block-wise missing data scenarios.
Date: 2024
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0307482 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 07482&type=printable (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0307482
DOI: 10.1371/journal.pone.0307482
Access Statistics for this article
More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone ().