A clustering-based discretization for supervised learning

Gupta, Ankit; Mehrotra, Kishan G.; Mohan, Chilukuri

A clustering-based discretization for supervised learning

Ankit Gupta, Kishan G. Mehrotra and Chilukuri Mohan

Statistics & Probability Letters, 2010, vol. 80, issue 9-10, 816-824

Abstract: We address the problem of discretization of continuous variables for machine learning classification algorithms. Existing procedures do not use interdependence between the variables towards this goal. Our proposed method uses clustering to exploit such interdependence. Numerical results show that this improves the classification performance in almost all cases. Even if an existing algorithm can successfully operate with continuous variables, better performance is obtained if the variables are first discretized. An additional advantage of discretization is that it reduces the overall computation time.

Keywords: Discretization; Clustering; Binning; Supervised; learning (search for similar items in EconPapers)
Date: 2010
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167-7152(10)00027-1
Full text for ScienceDirect subscribers only

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:eee:stapro:v:80:y:2010:i:9-10:p:816-824

Ordering information: This journal article can be ordered from
http://www.elsevier.com/wps/find/supportfaq.cws_home/regional
https://shop.elsevie ... _01_ooc_1&version=01

Access Statistics for this article

Statistics & Probability Letters is currently edited by Somnath Datta and Hira L. Koul

More articles in Statistics & Probability Letters from Elsevier
Bibliographic data for series maintained by Catherine Liu ().