Comprehensive Evaluation of Satellite-Based Rainfall Measurements Through Rain Gauge Validation Using Advanced Statistical Regression and Machine Learning Models by Using Python
K V Sumith
Additional contact information
K V Sumith: Sir M Visvesvaraya Institute of Technology
Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), 2025, vol. 39, issue 9, No 16, 4563-4587
Abstract:
The accuracy of rainfall data is crucial for climate monitoring, disaster prevention, and water resource management. This study evaluates the effectiveness of various regression and machine learning models for rainfall prediction using satellite-based and ground-based gauge data. The models tested include Linear, Ridge, Lasso, and Polynomial Regression, Random Forest, Decision Tree, Gradient Boosting Machine (GBM), Support Vector Machines, and Artificial Neural Networks (ANN). Without normalization, the Linear, Ridge, and Lasso regression models performed similarly, with R² values of 0.60 for training and 0.57 for testing, indicating reasonable accuracy but limited generalization. Polynomial regression achieved a higher training R² (0.66) but overfitted significantly, with test performance dropping to R² = 0.50. Random Forest and GBM performed well in training (R² = 0.94–0.95) but declined in testing (R² = 0.43–0.47), indicating some overfitting. After normalization using Min-Max and Z-score scaling, the regression models exhibited stable performance. Polynomial regression became more consistent but still displayed signs of overfitting. The machine learning models, particularly Random Forest and ANN, generalized better after normalization, with ANN achieving the best test R² value of 0.60. Normalization techniques, particularly Min-Max and Z-score, significantly improved model performance, and statistical analysis confirmed these improvements. The results highlight the potential of machine learning models, particularly Random Forest and ANN, for accurate rainfall prediction in applications such as flood warning systems, irrigation planning, and water resource management.
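The comparison described in the abstract (fit on raw inputs, then refit after Min-Max and Z-score normalization, reporting training and test R² each time) can be illustrated with a minimal sketch. It is not the author's code: the data are synthetic stand-ins generated with a fixed seed, the 80/20 split is an assumption, only a single-predictor least-squares fit is shown, and all function names are hypothetical.

```python
import random
import statistics


def min_max_scale(values, lo, hi):
    """Min-Max normalization: map values onto [0, 1] using training-set bounds."""
    span = hi - lo
    return [(v - lo) / span for v in values]


def z_score_scale(values, mean, std):
    """Z-score normalization: centre on the training mean, scale by training std."""
    return [(v - mean) / std for v in values]


def fit_linear(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def evaluate(x_train, y_train, x_test, y_test):
    """Fit on the training split and return (train R², test R²)."""
    slope, intercept = fit_linear(x_train, y_train)
    pred_train = [slope * xi + intercept for xi in x_train]
    pred_test = [slope * xi + intercept for xi in x_test]
    return r_squared(y_train, pred_train), r_squared(y_test, pred_test)


# Synthetic stand-in data (NOT from the paper): satellite estimate vs. gauge reading.
rng = random.Random(0)
satellite = [rng.uniform(0.0, 50.0) for _ in range(200)]
gauge = [0.8 * s + rng.gauss(0.0, 8.0) for s in satellite]

# Assumed 80/20 train/test split.
split = 160
x_train, x_test = satellite[:split], satellite[split:]
y_train, y_test = gauge[:split], gauge[split:]

# Scaling parameters come from the training split only, to avoid test leakage.
lo, hi = min(x_train), max(x_train)
mean, std = statistics.fmean(x_train), statistics.stdev(x_train)

results = {
    "raw": evaluate(x_train, y_train, x_test, y_test),
    "min_max": evaluate(min_max_scale(x_train, lo, hi), y_train,
                        min_max_scale(x_test, lo, hi), y_test),
    "z_score": evaluate(z_score_scale(x_train, mean, std), y_train,
                        z_score_scale(x_test, mean, std), y_test),
}

for name, (r2_train, r2_test) in results.items():
    print(f"{name:8s} train R2 = {r2_train:.3f}  test R2 = {r2_test:.3f}")
```

For an unregularized linear fit, both scalings are affine transformations of the predictor, so the three rows agree up to floating-point error. Models that are sensitive to feature scale, such as regularized regression, support vector machines, and neural networks, can respond differently to the same preprocessing.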
Keywords: Satellite & Gauge Rainfall data; Python; Statistical Regression; Machine Learning Regression; Model Training & testing; A Paired t-test-Bootstrap Resampling
Date: 2025
Downloads: (external link)
http://link.springer.com/10.1007/s11269-025-04168-9 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:waterr:v:39:y:2025:i:9:d:10.1007_s11269-025-04168-9
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11269
DOI: 10.1007/s11269-025-04168-9
Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA) is currently edited by G. Tsakiris
More articles in Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA) from Springer, European Water Resources Association (EWRA)
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.