Improving Machine Learning Algorithms with CoClust-Based Feature Selection on Big Data: A Comparative Analysis

Zeynep Ilhan Taskin and Kasirga Yildirak
Additional contact information
Zeynep Ilhan Taskin: Eskisehir Osmangazi University
Kasirga Yildirak: Hacettepe University

A chapter in Directional and Multivariate Statistics, 2025, pp 411-439 from Springer

Abstract: The feature selection stage can be used to create machine learning algorithms, which can lead to better outcomes. The dependency structure between the variables is regarded as the most crucial factor in the feature selection stage. The Copula-Based Clustering technique (CoClust), which relies on non-linear dependency and groups only related variables, makes a difference in identifying the dependency structure. In this study, we demonstrate that by combining the Random Forest, AdaBoost, and XGBoost approaches with the CoClust-based feature selection step, it is possible to achieve a notable improvement in CPU times and accuracy. On two different big data sets, we compare CoClust with K-means and hierarchical clustering techniques in order to assess its contribution to the algorithms. CPU time, accuracy, and the ROC (receiver operating characteristic) curve are used to compare the results.

Keywords: Random forest; AdaBoost; XGBoost; CoClust; Feature selection
Date: 2025

Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-981-96-2004-3_21

Ordering information: This item can be ordered from
http://www.springer.com/9789819620043

DOI: 10.1007/978-981-96-2004-3_21

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2026-05-22
Handle: RePEc:spr:sprchp:978-981-96-2004-3_21