Validation-based model selection for 13C metabolic flux analysis with uncertain measurement errors
Nicolas Sundqvist,
Nina Grankvist,
Jeramie Watrous,
Mohit Jain,
Roland Nilsson and
Gunnar Cedersund
PLOS Computational Biology, 2022, vol. 18, issue 4, 1-27
Abstract:
Accurate measurements of metabolic fluxes in living cells are central to metabolism research and metabolic engineering. The gold standard method is model-based metabolic flux analysis (MFA), where fluxes are estimated indirectly from mass isotopomer data with the use of a mathematical model of the metabolic network. A critical step in MFA is model selection: choosing what compartments, metabolites, and reactions to include in the metabolic network model. Model selection is often done informally during the modelling process, based on the same data that is used for model fitting (estimation data). This can lead to either overly complex models (overfitting) or too simple ones (underfitting), in both cases resulting in poor flux estimates. Here, we propose a method for model selection based on independent validation data. We demonstrate in simulation studies that this method consistently chooses the correct model in a way that is independent of errors in measurement uncertainty. This independence is beneficial, since estimating the true magnitude of these errors can be difficult. In contrast, commonly used model selection methods based on the χ²-test choose different model structures depending on the believed measurement uncertainty; this can lead to errors in flux estimates, especially when the magnitude of the error is substantially off. We present a new approach for quantification of prediction uncertainty of mass isotopomer distributions in other labelling experiments, to check for problems with too much or too little novelty in the validation data. Finally, in an isotope tracing study on human mammary epithelial cells, the validation-based model selection method identified pyruvate carboxylase as a key model component. Our results argue that validation-based model selection should be an integral part of MFA model development.
Author summary:
Measuring metabolic reaction fluxes in living cells is difficult, yet important. The gold standard is to label extracellular metabolites with 13C, to use mass spectrometry to find out where the 13C atoms end up, and finally to use mathematical modelling to calculate how quickly each reaction must have flowed for the 13C atoms to end up there. This measurement thus relies on using the right mathematical model, which must be selected among various candidate models. In this manuscript, we present a new way to do this model selection step, utilizing validation data. Using an adopted approach to calculate the uncertainty of model predictions, we identify new validation experiments that are neither too similar nor too dissimilar compared to the previous training data. The model candidate that is best at predicting this new validation data is the one chosen. Tests on simulated data, where the true model is known, show that the validation-based method is robust when the magnitude of the error in the measurement uncertainty is unknown, something that conventional methods are not. This improvement is important since true uncertainties can be difficult to estimate for these data. Finally, we demonstrate how the new method can be used on real data to identify fluxes and important reactions.
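Illustrative sketch (not from the article): the selection rule described in the summary, "the model candidate that is best at predicting this new validation data is the one chosen", can be mimicked on a toy problem. Everything below is an assumption made for illustration: polynomial degrees stand in for candidate network models, the data are synthetic, and the names `fit_candidate`, `weighted_ssr`, and `select_by_validation` are hypothetical. It is not the authors' MFA implementation. The point it tries to show is that a uniformly mis-specified `believed_sigma` rescales every candidate's validation score by the same factor, so the chosen candidate does not change.

```python
import numpy as np


def fit_candidate(x, y, degree):
    """Least-squares polynomial fit; a stand-in for fitting one candidate model."""
    return np.polyfit(x, y, degree)


def weighted_ssr(coeffs, x, y, believed_sigma):
    """Sum of squared residuals, weighted by the believed measurement error."""
    residuals = (np.polyval(coeffs, x) - y) / believed_sigma
    return float(np.sum(residuals ** 2))


def select_by_validation(x_est, y_est, x_val, y_val, degrees, believed_sigma):
    """Fit each candidate on estimation data, score it on validation data,
    and return the candidate with the smallest validation score."""
    scores = {}
    for degree in degrees:
        coeffs = fit_candidate(x_est, y_est, degree)
        scores[degree] = weighted_ssr(coeffs, x_val, y_val, believed_sigma)
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data (assumption): the "true" model is a quadratic with noise.
rng = np.random.default_rng(0)
true_coeffs = [1.5, -2.0, 0.5]
x_est = np.linspace(0.0, 1.0, 20)    # estimation experiment
x_val = np.linspace(0.05, 1.1, 15)   # separate validation experiment
y_est = np.polyval(true_coeffs, x_est) + rng.normal(0.0, 0.05, x_est.size)
y_val = np.polyval(true_coeffs, x_val) + rng.normal(0.0, 0.05, x_val.size)

candidates = (0, 1, 2, 3, 4)

# Two very different beliefs about the measurement error.
best_small, scores_small = select_by_validation(
    x_est, y_est, x_val, y_val, candidates, believed_sigma=0.01)
best_large, scores_large = select_by_validation(
    x_est, y_est, x_val, y_val, candidates, believed_sigma=1.0)

print("chosen with sigma=0.01:", best_small)
print("chosen with sigma=1.0: ", best_large)
```

By contrast, a χ²-test compares the weighted residual sum against a fixed threshold, so whether a given candidate passes depends directly on the believed uncertainty. That dependence is what the abstract identifies as the weakness of the conventional approach.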
Date: 2022
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009999 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 09999&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1009999
DOI: 10.1371/journal.pcbi.1009999
More articles in PLOS Computational Biology from Public Library of Science