Predicting Countries’ Development Levels Using the Decision Tree and Random Forest Methods
Batuhan Özkan,
Coşkun Parim and
Erhan Çene
Additional contact information
Batuhan Özkan: Yıldız Teknik Üniversitesi, Fen Edebiyat Fakültesi, İstatistik Bölümü, İstanbul, Türkiye.
Coşkun Parim: Yıldız Teknik Üniversitesi, Fen Edebiyat Fakültesi, İstatistik Bölümü, İstanbul, Türkiye.
Erhan Çene: Yıldız Teknik Üniversitesi, Fen Edebiyat Fakültesi, İstatistik Bölümü, İstanbul, Türkiye.
EKOIST Journal of Econometrics and Statistics, 2023, vol. 0, issue 38, 87-104
Abstract:
Countries’ development levels are closely tied to their economic levels. Countries can be examined against various criteria and grouped by level of development, from underdeveloped to highly developed. Socioeconomic factors generally play a decisive role in determining these levels. Although development level is determined with the help of socioeconomic variables, different organizations (e.g., the United Nations [UN] and the International Monetary Fund [IMF]) may classify countries using different methods. As a result, the same country can fall into different development categories depending on the method used and the organization applying it. The aim of this study is to propose a machine learning model that predicts the development level of 193 countries. Development level has four categories: high income, upper middle income, lower middle income, and low income. The 26 variables that affect countries’ development levels were obtained from the World Development Indicators (WDI) database. First, random forest-based variable importance was used to identify the variables with the greatest effect on countries’ development levels. Countries’ development levels were then classified using decision tree and random forest algorithms, taking as inputs the most important variables selected through variable importance. The random forest model correctly classified countries’ development levels with an accuracy of 70%. In addition, the findings show that adolescent fertility rate, total fertility rate, and the share of agriculture, forestry, and fisheries in gross domestic product (GDP) are the most important variables affecting countries’ development levels.
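The variable-ranking step described in the abstract can be illustrated with a small sketch. The abstract does not say which importance measure the authors used, so the Gini-decrease measure here is an assumption. The code is also a simplified stand-in rather than a full random forest: it averages the best single-split Gini decrease of each feature over bootstrap resamples (depth-1 trees only, no random feature subsetting). The toy values, the `uninformative` feature, and the function names are all hypothetical and are not taken from the WDI data.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest Gini decrease achievable with one threshold on one feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Candidate thresholds: every distinct value except the largest.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def bootstrap_importance(rows, labels, n_trees=200, seed=0):
    """Average single-split Gini decrease per feature over bootstrap resamples.

    Simplified stand-in for random forest importance (assumption):
    depth-1 trees only, and every feature is scored in every resample.
    """
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    totals = [0.0] * p
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        for j in range(p):
            totals[j] += best_split_gain([rows[i][j] for i in idx], ys)
    return [total / n_trees for total in totals]


# Hypothetical toy data: [adolescent fertility rate, uninformative feature].
rows = [
    [10.0, 1.0], [12.0, 2.0], [15.0, 1.0], [18.0, 2.0],
    [60.0, 1.0], [75.0, 2.0], [90.0, 1.0], [110.0, 2.0],
]
labels = ["high"] * 4 + ["low"] * 4
names = ["adolescent_fertility", "uninformative"]

scores = bootstrap_importance(rows, labels)
ranking = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the first feature separates the two classes perfectly, so it ranks above the uninformative one. The top-ranked variables would then be passed on to the decision tree and random forest classifiers, as the abstract describes.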
Keywords: Development Level; Decision Tree; Random Forest; Fertility Rate; Machine Learning
Date: 2023
Downloads: (external link)
https://cdn.istanbul.edu.tr/file/JTA6CLJ8T5/E34BEDEC8771401BB444F47B6A7BCC48 (application/pdf)
https://iupress.istanbul.edu.tr/tr/journal/ekoist/ ... iyle-tahmin-edilmesi (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:ist:ekoist:v:0:y:2023:i:38:p:87-104
DOI: 10.26650/ekoist.2023.38.1172190
EKOIST Journal of Econometrics and Statistics is currently edited by Aycan HEPSAĞ
EKOIST Journal of Econometrics and Statistics, Istanbul University, Faculty of Economics.
Bibliographic data for series maintained by Istanbul University Press Operational Team (Ertuğrul YAŞAR).