Data analytics for gross domestic product using random forest and extreme gradient boosting approaches: an empirical study
Elsayed A.H. Elamir
International Journal of Data Mining, Modelling and Management, 2022, vol. 14, issue 3, 269-286
Abstract:
This study uses the random forest and extreme gradient boosting approaches to forecast and analyse gross domestic product per capita, using country-level data from the World Bank development indicators for the period 2010 to 2017. Comprehensive comparisons are carried out with the years before 2017 as training data and the year 2017 as testing data. The root mean square error and the coefficient of determination are used to compare the models. Measured by the coefficient of determination, the random forest and extreme gradient boosting models achieve 97.8% and 98.1%, respectively. The results suggest that investment in education, labour, health, and industry, together with reductions in inflation, interest, and unemployment, is necessary to raise gross domestic product per capita. A two-way interaction measure, useful for explaining co-dependencies in model behaviour, also gives encouraging results. The strongest interactions are trade-technology and technology-education, followed by consumption-health.
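The evaluation protocol described in the abstract (train on the years before 2017, test on 2017, score with root mean square error and the coefficient of determination) can be sketched roughly as follows. This is a minimal illustration, not the author's code: the field names `country`, `year`, and `gdp_pc` are hypothetical, the records are invented toy values rather than World Bank data, and a naive carry-forward predictor stands in for the random forest and extreme gradient boosting models.

```python
import math


def temporal_split(rows, test_year=2017):
    """Split records into training (years before test_year) and testing (test_year)."""
    train = [r for r in rows if r["year"] < test_year]
    test = [r for r in rows if r["year"] == test_year]
    return train, test


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy records invented for illustration only (NOT World Bank data).
rows = [
    {"country": "A", "year": 2015, "gdp_pc": 10.0},
    {"country": "A", "year": 2016, "gdp_pc": 11.0},
    {"country": "A", "year": 2017, "gdp_pc": 12.0},
    {"country": "B", "year": 2015, "gdp_pc": 20.0},
    {"country": "B", "year": 2016, "gdp_pc": 22.0},
    {"country": "B", "year": 2017, "gdp_pc": 24.0},
]

train, test = temporal_split(rows)

# Stand-in "model" (an assumption, not the paper's method):
# carry each country's most recent training value forward to the test year.
last_seen = {}
for r in sorted(train, key=lambda r: r["year"]):
    last_seen[r["country"]] = r["gdp_pc"]

actual = [r["gdp_pc"] for r in test]
predicted = [last_seen[r["country"]] for r in test]

print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"R^2  = {r_squared(actual, predicted):.3f}")
```

On this reading, a coefficient of determination of 0.978 on the 2017 test set corresponds to the 97.8% figure reported above. Splitting by year rather than at random keeps the test year entirely unseen during training.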
Keywords: bagging; boosting; business analytics; forecast; gross domestic product; GDP; machine learning.
Date: 2022
Downloads: (external link)
http://www.inderscience.com/link.php?id=125258 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:14:y:2022:i:3:p:269-286
More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker.