Random subspace-based ensemble classifier for high-dimensional data Using SPARK

Venkaiah Chowdary Bhimineni and Rajiv Senapati

PLOS ONE, 2026, vol. 21, issue 3, 1-26

Abstract: High-dimensional data classification remains challenging for machine learning models due to sparsity and overfitting caused by the ‘curse of dimensionality’. As the number of features increases, data points become sparse, hindering generalization in classification and leading to higher computational costs and reduced accuracy. To address these issues, we propose an ensemble classifier based on random subspaces implemented in the Spark framework. The proposed framework comprises three key stages. First, the high-dimensional data is normalised through min-max normalisation. Second, the master node partitions the data using improved deep fuzzy clustering (IDFC), while the slave node applies support vector machine-modified recursive feature elimination (SVM-MRFE) for efficient feature selection, followed by feature fusion. Finally, we introduce an improved subspace-based ensemble classifier (ISSBEC) that comprises a feature-fusion-based random subspace (FF-RSS), mixed-space enhancement (MSE), and multiple base classifiers. The efficacy of the ISSBEC classifier was evaluated using a set of performance metrics and compared with state-of-the-art methods. Experimental results demonstrate that the proposed approach improves both accuracy and robustness, offering a scalable solution to the limitations of high-dimensional datasets.
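
Illustrative sketch (not from the article): the two generic building blocks named in the abstract, min-max normalisation and a random subspace ensemble, can be outlined in plain Python as below. This is only an assumption-laden toy: the nearest-centroid base learner, the function names, and the parameters `n_models` and `subspace_size` are hypothetical choices made for illustration. It does not implement IDFC, SVM-MRFE, FF-RSS, MSE, or any Spark distribution, and it should not be read as the authors' ISSBEC method.

```python
import random
from collections import Counter


def min_max_normalise(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def fit_centroids(rows, labels, features):
    """Hypothetical base learner: per-class mean over the chosen features."""
    counts = Counter(labels)
    sums = {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict_one(centroids, features, row):
    """Return the class whose centroid is closest in the feature subspace."""
    def sq_dist(label):
        return sum((row[f] - c) ** 2
                   for f, c in zip(features, centroids[label]))
    return min(sorted(centroids), key=sq_dist)


def fit_random_subspace_ensemble(rows, labels, n_models=15,
                                 subspace_size=2, seed=0):
    """Train each base learner on its own random subset of feature indices."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    ensemble = []
    for _ in range(n_models):
        features = sorted(rng.sample(range(n_features), subspace_size))
        ensemble.append((features, fit_centroids(rows, labels, features)))
    return ensemble


def predict(ensemble, row):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_one(centroids, features, row)
                    for features, centroids in ensemble)
    return votes.most_common(1)[0][0]
```

Each base learner sees only `subspace_size` of the features, which is the core idea of the random subspace method; using an odd `n_models` avoids tied votes in a two-class problem.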

Date: 2026
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0342408 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 42408&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0342408

DOI: 10.1371/journal.pone.0342408

Handle: RePEc:plo:pone00:0342408