A Transparent House Price Prediction Framework Using Ensemble Learning, Genetic Algorithm-Based Tuning, and ANOVA-Based Feature Analysis
Mohammed Ibrahim Hussain,
Arslan Munir,
Mohammad Mamun,
Safiul Haque Chowdhury,
Nazim Uddin and
Muhammad Minoar Hossain
Additional contact information
Mohammed Ibrahim Hussain: Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh
Arslan Munir: Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
Mohammad Mamun: Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh
Safiul Haque Chowdhury: Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh
Nazim Uddin: Department of ICT, Chandpur Science and Technology University, Chandpur 3600, Bangladesh
Muhammad Minoar Hossain: Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh
FinTech, 2025, vol. 4, issue 3, 1-26
Abstract:
House price prediction is crucial in real estate for informed decision-making. This paper presents an automated prediction system that combines genetic algorithms (GA) for feature optimization with Analysis of Variance (ANOVA) for statistical analysis. We apply and compare five ensemble machine learning (ML) models: Extreme Gradient Boosting Regression (XGBR), Random Forest Regression (RFR), Categorical Boosting Regression (CBR), Adaptive Boosting Regression (ADBR), and Gradient Boosted Decision Trees Regression (GBDTR). The primary dataset contains 1000 samples with eight features, and a secondary dataset of 3865 samples is used for external validation. Our integrated approach identifies Categorical Boosting with GA (CBRGA) as the best performer, achieving an R² of 0.9973 and outperforming existing state-of-the-art methods. ANOVA-based analysis further enhances model interpretability and performance by isolating key factors such as square footage and lot size. To ensure robustness and transparency, we conduct 10-fold cross-validation and employ explainable AI techniques, namely Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), which provide insights into model decision-making and feature importance.
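As a rough illustration of two quantities named in the abstract, the sketch below computes a one-way ANOVA F statistic and the coefficient of determination R² using only the Python standard library. It is not the authors' pipeline: the function names, the idea of grouping prices by a size band, and the toy numbers are all assumptions made for this example, and the record does not say which ANOVA variant the paper uses.

```python
from statistics import mean


def one_way_anova_f(groups):
    """One-way ANOVA F statistic: between-group variance over within-group variance."""
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical toy data: prices (in thousands) grouped by a made-up size band.
price_by_size_band = [
    [100, 110, 120],   # small
    [150, 160, 170],   # medium
    [200, 210, 220],   # large
]

if __name__ == "__main__":
    print(one_way_anova_f(price_by_size_band))           # 75.0 for this toy data
    print(r_squared([100, 160, 210], [105, 155, 212]))   # close to 1 for a good fit
```

A large F value indicates that the grouping feature separates prices well relative to the spread inside each group, which is the sense in which an ANOVA-style analysis can flag features such as square footage as influential.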
Keywords: machine learning; explainable artificial intelligence; genetic algorithms; ANOVA analysis; house price prediction
JEL-codes: C6 F3 G O3
Date: 2025
Downloads:
https://www.mdpi.com/2674-1032/4/3/33/pdf (application/pdf)
https://www.mdpi.com/2674-1032/4/3/33/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jfinte:v:4:y:2025:i:3:p:33-:d:1704651
FinTech is currently edited by Ms. Lizzy Zhou
Bibliographic data for series maintained by MDPI Indexing Manager.