Number of Instances for Reliable Feature Ranking in a Given Problem

Bohanec Marko, Borštnar Mirjana Kljajić and Robnik-Šikonja Marko
Additional contact information
Bohanec Marko: Salvirt Ltd., Ljubljana, Slovenia
Borštnar Mirjana Kljajić: Faculty of Organizational Sciences, University of Maribor, Kranj, Slovenia
Robnik-Šikonja Marko: Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia

Business Systems Research, 2018, vol. 9, issue 2, 35-44

Abstract: Background: In practical use of machine learning models, users may add new features to an existing classification model, reflecting their (changed) empirical understanding of a field. New features potentially increase the classification accuracy of the model or improve its interpretability. Objectives: We introduce a guideline for determining the sample size needed to reliably estimate the impact of a new feature. Methods/Approach: Our approach is based on the feature evaluation measure ReliefF and the bootstrap-based estimation of confidence intervals for feature ranks. Results: We test our approach using real-world qualitative business-to-business sales forecasting data and two UCI data sets, one with missing values. The results show that new features with a high or a low rank can be detected using a relatively small number of instances, but features ranked near the border of useful features need larger samples to determine their impact. Conclusions: A combination of the feature evaluation measure ReliefF and the bootstrap-based estimation of confidence intervals can be used to reliably estimate the impact of a new feature in a given problem.
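
The idea in the abstract, bootstrap confidence intervals for feature ranks, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scorer below is a simplified Relief (one nearest hit and one nearest miss, Hamming distance on categorical features) standing in for the full ReliefF, and the function names, the toy data, and the percentile-interval choice are all assumptions made for illustration.

```python
import random


def relief_scores(X, y):
    """Simplified Relief (assumption: a stand-in for ReliefF).

    Uses one nearest hit and one nearest miss per instance and
    Hamming distance over categorical feature values.
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    for i in range(n):
        hit = miss = None
        hit_dist = miss_dist = d + 1
        for j in range(n):
            if j == i:
                continue
            dist = sum(a != b for a, b in zip(X[i], X[j]))
            if y[j] == y[i] and dist < hit_dist:
                hit, hit_dist = j, dist
            elif y[j] != y[i] and dist < miss_dist:
                miss, miss_dist = j, dist
        if hit is None or miss is None:
            continue  # this sample lacks a same-class or other-class neighbour
        for f in range(d):
            # Reward a feature that differs on the nearest miss,
            # penalise one that differs on the nearest hit.
            weights[f] += (X[i][f] != X[miss][f]) - (X[i][f] != X[hit][f])
    return [w / n for w in weights]


def ranks(scores):
    """Rank 1 is the best score; ties are broken by feature index."""
    order = sorted(range(len(scores)), key=lambda f: -scores[f])
    result = [0] * len(scores)
    for position, f in enumerate(order, start=1):
        result[f] = position
    return result


def bootstrap_rank_ci(X, y, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the rank of each feature."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    rank_samples = [[] for _ in range(d)]
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        r = ranks(relief_scores([X[i] for i in idx], [y[i] for i in idx]))
        for f in range(d):
            rank_samples[f].append(r[f])
    lo = int(n_boot * alpha / 2)
    hi = n_boot - 1 - lo
    return [(sorted(s)[lo], sorted(s)[hi]) for s in rank_samples]


def make_toy_data(n, seed=0):
    """Hypothetical data: feature 0 equals the class, features 1-2 are noise."""
    rng = random.Random(seed)
    X = [[rng.randint(0, 1) for _ in range(3)] for _ in range(n)]
    y = [row[0] for row in X]
    return X, y


if __name__ == "__main__":
    for n in (20, 60):
        X, y = make_toy_data(n, seed=1)
        print(n, bootstrap_rank_ci(X, y, n_boot=100, seed=2))
```

Running the interval estimate on growing subsets of the data and watching whether a new feature's rank interval stays clear of the cut-off between useful and useless features is one way to read the sample-size guideline described above. A feature whose interval straddles that border would, under this reading, call for more instances.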

Keywords: machine learning; feature ranking; feature evaluation
Date: 2018

Downloads: (external link) ... -0017.xml?format=INT (text/html)

Business Systems Research is currently edited by Mirjana Pejić Bach

More articles in Business Systems Research from Sciendo
Bibliographic data for series maintained by Peter Golla.

Page updated 2018-09-04
Handle: RePEc:bit:bsrysr:v:9:y:2018:i:2:p:35-44:n:4