MIL: a data discretisation approach

Bikash Kanti Sarkar, Shib Sankar Sana and Kripasindhu Chaudhuri

International Journal of Data Mining, Modelling and Management, 2011, vol. 3, issue 3, 303-318

Abstract: Data discretisation is an important step in machine learning, since classifiers generally handle discrete attributes more easily than continuous ones. Over the years, several discretisation methods, such as Boolean reasoning, equal frequency binning, and entropy-based methods, have been proposed, explored, and implemented. In this article, a simple supervised discretisation approach called minimum information loss (MIL) is introduced. The prime goal of MIL is to maximise the classification accuracy of a classifier while minimising the loss of information during the discretisation of continuous attributes. The performance of the suggested approach is compared with that of two supervised discretisation algorithms, selective pseudo iterative deletion 4.7 (SPID4.7) and minimum description length principle (MDLP), using four state-of-the-art rule inductive algorithms: neural network, C4.5, Naive-Bayes, and CN2. The empirical results show that the presented approach performs better in several cases in comparison to the other two algorithms.

Keywords: data mining; data discretisation; classifiers; accuracy; information loss; machine learning.
Date: 2011

Downloads: (external link)
http://www.inderscience.com/link.php?id=41811 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:3:y:2011:i:3:p:303-318

More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker.

 
Page updated 2025-03-19
Handle: RePEc:ids:ijdmmm:v:3:y:2011:i:3:p:303-318