Machine Learning-Based Prediction of Ecosystem-Scale CO₂ Flux Measurements
Jeffrey Uyekawa,
John Leland,
Darby Bergl,
Yujie Liu,
Andrew D. Richardson and
Benjamin Lucas
Additional contact information
Jeffrey Uyekawa: Department of Mathematics and Statistics, Northern Arizona University, Flagstaff, AZ 86011, USA
John Leland: Department of Mathematics and Statistics, Northern Arizona University, Flagstaff, AZ 86011, USA
Darby Bergl: Center for Ecosystem Science and Society, Northern Arizona University, Flagstaff, AZ 86011, USA
Yujie Liu: Center for Ecosystem Science and Society, Northern Arizona University, Flagstaff, AZ 86011, USA
Andrew D. Richardson: Center for Ecosystem Science and Society, Northern Arizona University, Flagstaff, AZ 86011, USA
Benjamin Lucas: Department of Mathematics and Statistics, Northern Arizona University, Flagstaff, AZ 86011, USA
Land, 2025, vol. 14, issue 1, 1-27
Abstract:
AmeriFlux is a network of hundreds of sites across the contiguous United States providing tower-based ecosystem-scale carbon dioxide flux measurements at 30 min temporal resolution. While geographically wide-ranging, the network has suffered from multiple issues over its existence, including towers regularly ceasing operation for extended periods and a lack of standardization of measurements between sites. In this study, we use machine learning algorithms to predict CO₂ flux measurements at NEON sites (a subset of AmeriFlux sites), creating a model to gap-fill measurements when sites are down or to replace measurements when they are incorrect. Machine learning algorithms can also generalize to new sites, potentially even those without a flux tower. We compared the performance of seven machine learning algorithms using 35 environmental drivers and site-specific variables as predictors. We found that Extreme Gradient Boosting (XGBoost) consistently produced the most accurate predictions (Root Mean Squared Error of 1.81 μmol m⁻² s⁻¹, R² of 0.86). The model performed very well when tested on sites that are ecologically similar to other sites in the data (the Mid-Atlantic, New England, and the Rocky Mountains), but performed worse at sites with fewer ecological similarities to other sites in the data (Pacific Northwest, Florida, and Puerto Rico). The results show strong potential for machine learning-based models to make more skillful predictions than state-of-the-art process-based models: the model estimated the multi-year mean carbon balance to within ±50 gC m⁻² y⁻¹ for 29 of our 44 test sites. These results have significant implications for accurately predicting the carbon flux or gap-filling an extended outage at any AmeriFlux site, and for quantifying carbon flux in support of natural climate solutions.
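The two units quoted in the abstract (μmol m⁻² s⁻¹ for the half-hourly flux error, gC m⁻² y⁻¹ for the carbon balance) can be related with a short calculation. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, R² is assumed to be the coefficient of determination, and a 365-day year is assumed for the conversion.

```python
import math

MOLAR_MASS_C = 12.011               # grams of carbon per mole of CO2
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the inputs."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition of R^2)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def annual_carbon_balance(mean_flux_umol):
    """Convert a mean flux in umol CO2 m^-2 s^-1 to gC m^-2 y^-1."""
    return mean_flux_umol * 1e-6 * MOLAR_MASS_C * SECONDS_PER_YEAR


# A sustained mean flux of 1 umol m^-2 s^-1 is about 378.8 gC m^-2 y^-1.
print(round(annual_carbon_balance(1.0), 1))
```

Under these assumptions, an annual error of ±50 gC m⁻² y⁻¹ corresponds to a bias in the mean flux of roughly 0.13 μmol m⁻² s⁻¹, which is much smaller than the half-hourly RMSE because random errors largely cancel when averaged over a year.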
Keywords: carbon dioxide flux; nature-based climate solutions; machine learning; XGBoost; National Ecological Observatory Network; AmeriFlux; phenocam
JEL-codes: Q15 Q2 Q24 Q28 Q5 R14 R52
Date: 2025
Downloads: (external link)
https://www.mdpi.com/2073-445X/14/1/124/pdf (application/pdf)
https://www.mdpi.com/2073-445X/14/1/124/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jlands:v:14:y:2025:i:1:p:124-:d:1563163
Land is currently edited by Ms. Carol Ma