EconPapers    
Economics at your fingertips  
 

A Density-Based Random Forest for Imbalanced Data Classification

Jia Dong and Quan Qian
Additional contact information
Jia Dong: School of Computer Engineering & Science, Shanghai University, 99 Shangda Rd., Shanghai 200444, China
Quan Qian: School of Computer Engineering & Science, Shanghai University, 99 Shangda Rd., Shanghai 200444, China

Future Internet, 2022, vol. 14, issue 3, 1-20

Abstract: Many machine learning problem domains, such as the detection of fraud, spam, outliers, and anomalies, tend to involve inherently imbalanced class distributions of samples. However, most classification algorithms assume equivalent sample sizes for each class. Therefore, imbalanced classification datasets pose a significant challenge in prediction modeling. Herein, we propose a density-based random forest algorithm (DBRF) to improve the prediction performance, especially for minority classes. DBRF is designed to recognize boundary samples as the most difficult to classify and then use a density-based method to augment them. Subsequently, two different random forest classifiers were constructed to model the augmented boundary samples and the original dataset dependently, and the final output was determined using a bagging technique. A real-world material classification dataset and 33 open public imbalanced datasets were used to evaluate the performance of DBRF. On the 34 datasets, DBRF could achieve improvements of 2–15% over random forest in terms of the F1-measure and G-mean. The experimental results proved the ability of DBRF to solve the problem of classifying objects located on the class boundary, including objects of minority classes, by taking into account the density of objects in space.

Keywords: density-based random forest; imbalanced data classification; boundary and density domain partition (search for similar items in EconPapers)
JEL-codes: O3 (search for similar items in EconPapers)
Date: 2022
References: View complete reference list from CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
https://www.mdpi.com/1999-5903/14/3/90/pdf (application/pdf)
https://www.mdpi.com/1999-5903/14/3/90/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:14:y:2022:i:3:p:90-:d:770486

Access Statistics for this article

Future Internet is currently edited by Ms. Grace You

More articles in Future Internet from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().

 
Page updated 2025-03-19
Handle: RePEc:gam:jftint:v:14:y:2022:i:3:p:90-:d:770486