A Transparent House Price Prediction Framework Using Ensemble Learning, Genetic Algorithm-Based Tuning, and ANOVA-Based Feature Analysis

Mohammed Ibrahim Hussain, Arslan Munir, Mohammad Mamun, Safiul Haque Chowdhury, Nazim Uddin and Muhammad Minoar Hossain
Additional contact information
Mohammed Ibrahim Hussain: Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh
Arslan Munir: Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
Mohammad Mamun: Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh
Safiul Haque Chowdhury: Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh
Nazim Uddin: Department of ICT, Chandpur Science and Technology University, Chandpur 3600, Bangladesh
Muhammad Minoar Hossain: Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh

FinTech, 2025, vol. 4, issue 3, 1-26

Abstract: House price prediction is crucial in real estate for informed decision-making. This paper presents an automated prediction system that combines genetic algorithms (GA) for feature optimization and Analysis of Variance (ANOVA) for statistical analysis. We apply and compare five ensemble machine learning (ML) models, namely Extreme Gradient Boosting Regression (XGBR), Random Forest Regression (RFR), Categorical Boosting Regression (CBR), Adaptive Boosting Regression (ADBR), and Gradient Boosted Decision Trees Regression (GBDTR). The primary dataset contains 1000 samples with eight features, and a secondary dataset of 3865 samples is used for external validation. Our integrated approach identifies Categorical Boosting with GA (CBRGA) as the best performer, achieving an R² of 0.9973 and outperforming existing state-of-the-art methods. ANOVA-based analysis further enhances model interpretability and performance by isolating key factors such as square footage and lot size. To ensure robustness and transparency, we conduct 10-fold cross-validation and employ explainable AI techniques such as Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), providing insights into model decision-making and feature importance.
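
As an illustration of the ANOVA-based feature analysis mentioned in the abstract, the following is a minimal sketch of a one-way ANOVA F statistic in plain Python. It is not the authors' implementation: the function name `anova_f`, the grouping of prices by feature bins, and all numbers are hypothetical and chosen only to show the idea that a feature whose bins separate prices well yields a larger F value.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numeric values.

    F = (between-group mean square) / (within-group mean square).
    Assumes at least two groups and nonzero within-group variance.
    """
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical prices (in thousands) grouped by square-footage bin.
by_sqft = [[150, 160, 155], [210, 220, 215], [300, 310, 305]]
# The same prices grouped by a hypothetical uninformative feature.
by_noise = [[150, 220, 305], [160, 215, 300], [155, 210, 310]]

f_sqft = anova_f(by_sqft)
f_noise = anova_f(by_noise)
print(f"F (square footage bins): {f_sqft:.2f}")
print(f"F (uninformative bins):  {f_noise:.2f}")
```

In this toy setup the square-footage grouping produces the larger F statistic, which is the kind of signal an ANOVA-based ranking would use to flag a feature as a key factor.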

Keywords: machine learning; explainable artificial intelligence; genetic algorithms; ANOVA analysis; house price prediction
JEL-codes: C6 F3 G O3
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2674-1032/4/3/33/pdf (application/pdf)
https://www.mdpi.com/2674-1032/4/3/33/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jfinte:v:4:y:2025:i:3:p:33-:d:1704651

FinTech is currently edited by Ms. Lizzy Zhou

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-07-19
Handle: RePEc:gam:jfinte:v:4:y:2025:i:3:p:33-:d:1704651