Feature selection in accident data: an analysis of its application in classification algorithms
Amrita Sarkar, G. Sahoo and U.C. Sahoo
International Journal of Data Analysis Techniques and Strategies, 2016, vol. 8, issue 2, 108-121
Abstract:
Feature selection aims to select a reduced subset of features with high predictive information and to remove irrelevant features with minimal predictive information. In this paper, we propose an ensemble approach to feature selection that applies multiple feature selection techniques and combines their outputs to yield more robust and stable results. The ensemble of feature ranking techniques is built in two steps: the first step creates a set of different feature selectors, and the second step combines the results of all the feature ranking techniques. The method is tested on an accident dataset, with the goal of improving predictive performance for accidents in Kolkata. Following feature selection, the paper also explains the significance of data mining classification algorithms for building classification models on the accident datasets with various selected subsets of features. Finally, the classification models are assessed in terms of the AUC performance metric.
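The two-step ensemble described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state which rankers are used or how their results are combined, so everything specific here is an assumption: the combination rule is mean-rank aggregation, the feature names and scores are hypothetical, and the helper names (`scores_to_ranks`, `ensemble_ranking`, `auc`) are invented for illustration. The `auc` helper computes the area under the ROC curve through its pairwise-comparison (Mann-Whitney) form, which is one standard way to obtain the metric named in the abstract.

```python
from statistics import mean


def scores_to_ranks(scores):
    """Convert feature -> score (higher is better) into feature -> rank (1 = best).

    Tied scores share the average of the rank positions they span.
    """
    ordered = sorted(scores, key=lambda f: scores[f], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of features tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        avg_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = avg_rank
        i = j + 1
    return ranks


def ensemble_ranking(score_tables):
    """Combine several rankers by mean rank (an assumed rule, not the paper's)."""
    rank_tables = [scores_to_ranks(table) for table in score_tables]
    features = rank_tables[0].keys()
    mean_rank = {f: mean(r[f] for r in rank_tables) for f in features}
    # Lower mean rank is better; the feature name breaks exact ties.
    return sorted(features, key=lambda f: (mean_rank[f], f))


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ordered correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Step 1: a set of different feature selectors, represented here by
# hypothetical per-feature scores from three rankers.
ranker_scores = [
    {"weather": 0.30, "vehicle_type": 0.10, "time_of_day": 0.25, "road_type": 0.05},
    {"weather": 0.40, "vehicle_type": 0.20, "time_of_day": 0.50, "road_type": 0.10},
    {"weather": 0.90, "vehicle_type": 0.60, "time_of_day": 0.70, "road_type": 0.20},
]

# Step 2: combine the individual rankings into one ensemble ranking.
ranking = ensemble_ranking(ranker_scores)
top_k = ranking[:2]  # subset of features passed on to a classifier
```

In this sketch, `top_k` would be the feature subset used to train a classifier, and `auc(labels, predicted_scores)` would then compare models built on different subsets. Mean rank is only one possible aggregation; median rank or vote counting would fit the same two-step structure.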
Keywords: feature selection; feature ranking; classification algorithms; accident data analysis; ensemble ranking; India; Kolkata; road traffic accidents.
Date: 2016
Downloads:
http://www.inderscience.com/link.php?id=77484 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:injdan:v:8:y:2016:i:2:p:108-121
More articles in International Journal of Data Analysis Techniques and Strategies from Inderscience Enterprises Ltd