Machine Learning Approaches for Predicting Maize Biomass Yield: Leveraging Feature Engineering and Comprehensive Data Integration
Maryam Abbasi,
Paulo Váz,
José Silva and
Pedro Martins ()
Additional contact information
Maryam Abbasi: Applied Research Institute, Polytechnic of Coimbra, 3000 Coimbra, Portugal
Paulo Váz: Research Center in Digital Services, Polytechnic of Viseu, 3500 Viseu, Portugal
José Silva: Research Center in Digital Services, Polytechnic of Viseu, 3500 Viseu, Portugal
Pedro Martins: Research Center in Digital Services, Polytechnic of Viseu, 3500 Viseu, Portugal
Sustainability, 2025, vol. 17, issue 1, 1-15
Abstract:
The efficient prediction of corn biomass yield is critical for optimizing crop production and addressing global challenges in sustainable agriculture and renewable energy. This study employs advanced machine learning techniques, including Gradient Boosting Machines (GBMs), Random Forests (RFs), Support Vector Machines (SVMs), and Artificial Neural Networks (ANNs), integrated with comprehensive environmental, soil, and crop management data from key agricultural regions in the United States. A novel framework combines feature engineering, such as the creation of a Soil Fertility Index (SFI) and Growing Degree Days (GDDs), and the incorporation of interaction terms to address complex non-linear relationships between input variables and biomass yield. We conduct extensive sensitivity analysis and employ SHAP (SHapley Additive exPlanations) values to enhance model interpretability, identifying SFI, GDDs, and cumulative rainfall as the most influential features driving yield outcomes. Our findings highlight significant synergies among these variables, emphasizing their critical role in rural environmental governance and precision agriculture. Furthermore, an ensemble approach combining GBMs, RFs, and ANNs outperformed individual models, achieving an RMSE of 0.80 t/ha and R 2 of 0.89. These results underscore the potential of hybrid modeling for real-world applications in sustainable farming practices. Addressing the concerns of passive farmer participation, we propose targeted incentives, education, and institutional support mechanisms to enhance stakeholder collaboration in rural environmental governance. While the models assume rational decision-making, the inclusion of cultural and political factors warrants further investigation to improve the robustness of the framework. Additionally, a map of the study region and improved visualizations of feature importance enhance the clarity and relevance of our findings. This research contributes to the growing body of knowledge on predictive modeling in agriculture, combining theoretical rigor with practical insights to support policymakers and stakeholders in optimizing resource use and addressing environmental challenges. By improving the interpretability and applicability of machine learning models, this study provides actionable strategies for enhancing crop yield predictions and advancing rural environmental governance.
Keywords: biomass yield prediction; machine learning; gradient boosting machines; random forest; support vector machines; soil fertility index; feature engineering; sensitivity analysis; sustainable agriculture; renewable energy; crop management (search for similar items in EconPapers)
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56 (search for similar items in EconPapers)
Date: 2025
References: View complete reference list from CitEc
Citations: View citations in EconPapers (1)
Downloads: (external link)
https://www.mdpi.com/2071-1050/17/1/256/pdf (application/pdf)
https://www.mdpi.com/2071-1050/17/1/256/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:17:y:2025:i:1:p:256-:d:1558785
Access Statistics for this article
Sustainability is currently edited by Ms. Alexandra Wu
More articles in Sustainability from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().