Imbalanced Data Classification Based on Improved Random-SMOTE and Feature Standard Deviation

Ying Zhang, Li Deng and Bo Wei
Additional contact information
Ying Zhang: School of Science, Zhejiang Sci-Tech University, Hangzhou 310018, China
Li Deng: School of Science, Zhejiang Sci-Tech University, Hangzhou 310018, China
Bo Wei: School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China

Mathematics, 2024, vol. 12, issue 11, 1-17

Abstract: Oversampling techniques are widely used to rebalance imbalanced datasets. However, most oversampling methods can introduce noise and blur class boundaries, which leads to overfitting. To address this problem, we propose a new method, FSDR-SMOTE, based on Random-SMOTE and feature standard deviation for rebalancing imbalanced datasets. The method first removes noisy samples using the Tukey criterion. It then calculates the feature standard deviation, which reflects the degree of data dispersion, to detect each sample's location and classify the samples into boundary samples and safe samples. Next, the K-means clustering algorithm partitions the minority class samples into several sub-clusters. Within each sub-cluster, new samples are generated from random samples, boundary samples, and the corresponding sub-cluster center. Experimental results on 20 benchmark datasets selected from the UCI Machine Learning Repository show that FSDR-SMOTE achieves average values of 93.31% for the F-measure, 93.16% for the G-mean, and 86.53% for MCC.
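
Two ingredients named in the abstract have standard textbook definitions: the Tukey criterion for flagging outliers and the three evaluation measures (F-measure, G-mean, MCC). The following is a minimal Python sketch of both, not the authors' implementation. It assumes the Tukey criterion is applied per feature with the usual 1.5 x IQR fences and that the minority class is treated as the positive class; the abstract does not specify either detail, and the function names `tukey_fences`, `drop_tukey_outliers`, and `imbalance_metrics` are hypothetical.

```python
import math
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_tukey_outliers(samples, k=1.5):
    """Keep samples whose every feature lies inside that feature's fences.

    Assumption: per-feature filtering. The paper may apply the criterion
    to a different quantity (for example, distances between samples).
    """
    fences = [tukey_fences(column, k) for column in zip(*samples)]
    return [
        row for row in samples
        if all(lo <= x <= hi for x, (lo, hi) in zip(row, fences))
    ]


def imbalance_metrics(tp, fp, tn, fn):
    """Return (F-measure, G-mean, MCC) from binary confusion counts.

    Assumption: the minority class is the positive class.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall else 0.0
    )
    g_mean = math.sqrt(recall * specificity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f_measure, g_mean, mcc


if __name__ == "__main__":
    # Toy minority-class samples; the last row has an extreme first feature.
    minority = [(1.0, 10.0), (2.0, 11.0), (3.0, 12.0),
                (4.0, 13.0), (5.0, 14.0), (100.0, 12.0)]
    print(drop_tukey_outliers(minority))
    print(imbalance_metrics(tp=40, fp=10, tn=40, fn=10))
```

The sketch stops before the resampling step itself: the abstract does not give the rule for combining random samples, boundary samples, and sub-cluster centers, so that part is left out rather than guessed.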

Keywords: imbalanced data; feature standard deviation; oversampling strategy; Random-SMOTE
JEL-codes: C
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2227-7390/12/11/1709/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/11/1709/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:11:p:1709-:d:1405769

Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:12:y:2024:i:11:p:1709-:d:1405769