Optimizing Spatial Scales for Evaluating High-Resolution CO₂ Fossil Fuel Emissions: Multi-Source Data and Machine Learning Approach

Yujun Fang, Rong Li and Jun Cao
Additional contact information
Yujun Fang: Faculty of Resources and Environmental Science, Hubei University, Wuhan 430062, China
Rong Li: Faculty of Resources and Environmental Science, Hubei University, Wuhan 430062, China
Jun Cao: School of Architecture and Engineering, Wuhan City Polytechnic, Wuhan 430064, China

Sustainability, 2025, vol. 17, issue 20, 1-24

Abstract: High-resolution CO₂ fossil fuel emission data are critical for developing targeted mitigation policies. As a key approach for estimating spatial distributions of CO₂ emissions, top-down methods typically rely on spatial proxies to disaggregate administrative-level emissions to finer spatial scales. However, conventional linear regression models may fail to capture complex non-linear relationships between proxies and emissions. Furthermore, methods relying on nighttime light data are mostly inadequate for representing emissions in both industrial and rural zones. To address these limitations, this study developed a multiple proxy framework integrating nighttime light, points of interest (POIs), population, road networks, and impervious surface area data. Seven machine learning algorithms (Extra-Trees, Random Forest, XGBoost, CatBoost, Gradient Boosting Decision Trees, LightGBM, and Support Vector Regression) were applied to estimate high-resolution CO₂ fossil fuel emissions. Evaluation showed that the multiple proxy Extra-Trees model significantly outperformed the single-proxy nighttime light linear regression model at the county scale, achieving R² = 0.96 (RMSE = 0.52 MtCO₂) in cross-validation and R² = 0.92 (RMSE = 0.54 MtCO₂) on the independent test set. Feature importance analysis identified brightness of nighttime light (40.70%) and heavy industrial density (21.11%) as the most critical spatial proxies. The proposed approach also showed strong spatial consistency with the Multi-resolution Emission Inventory for China, with correlation coefficients of 0.82–0.84. This study demonstrates that integrating local multiple proxy data with machine learning corrects spatial biases inherent in traditional top-down approaches, establishing a transferable framework for high-resolution emissions mapping.
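
The top-down disaggregation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `disaggregate`, the county total, and the brightness values are all hypothetical, and the even-split fallback is an assumption of this sketch. It only shows the general idea of allocating an administrative-level total to grid cells in proportion to a single proxy.

```python
def disaggregate(total, weights):
    """Split an administrative total across grid cells in proportion to proxy weights."""
    if any(w < 0 for w in weights):
        raise ValueError("proxy weights must be non-negative")
    weight_sum = sum(weights)
    if weight_sum == 0:
        # No proxy signal at all: fall back to an even split (assumption of this sketch).
        return [total / len(weights)] * len(weights)
    return [total * w / weight_sum for w in weights]


# Hypothetical county with a 10.0 MtCO2 total and four grid cells.
# The nighttime-light brightness values are made up for illustration.
county_total = 10.0
ntl_brightness = [5.0, 15.0, 0.0, 30.0]

cell_emissions = disaggregate(county_total, ntl_brightness)
print(cell_emissions)
```

In this toy case the third cell has zero brightness and therefore receives no emissions, whatever activity actually occurs there. That is one way a single nighttime-light proxy could misrepresent dimly lit industrial or rural areas, which is the limitation the abstract says the multiple proxy framework is meant to address.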

Keywords: CO₂ emissions; machine learning; multi-source data; spatial distribution; Hubei Province
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2071-1050/17/20/9009/pdf (application/pdf)
https://www.mdpi.com/2071-1050/17/20/9009/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:17:y:2025:i:20:p:9009-:d:1768922

Sustainability is currently edited by Ms. Alexandra Wu

More articles in Sustainability from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-10-15
Handle: RePEc:gam:jsusta:v:17:y:2025:i:20:p:9009-:d:1768922