EconPapers    
Economics at your fingertips  
 

An Efficient Method for Discretizing Continuous Attributes

Kelley M. Engle and Aryya Gangopadhyay
Additional contact information
Kelley M. Engle: University of Maryland Baltimore County, USA
Aryya Gangopadhyay: University of Maryland Baltimore County, USA

International Journal of Data Warehousing and Mining (IJDWM), 2010, vol. 6, issue 2, 1-21

Abstract: In this paper the authors present a novel method for finding optimal split points for discretization of continuous attributes. Such a method can be used in many data mining techniques for large databases. The method consists of two major steps. In the first step search space is pruned using a bisecting region method that partitions the search space and returns the point with the highest information gain based on its search. The second step consists of a hill climbing algorithm that starts with the point returned by the first step and greedily searches for an optimal point. The methods were tested using fifteen attributes from two data sets. The results show that the method reduces the number of searches drastically while identifying the optimal or near-optimal split points. On average, there was a 98% reduction in the number of information gain calculations with only 4% reduction in information gain.

Date: 2010
References: Add references at CitEc
Citations:

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 4018/jdwm.2010040101 (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:6:y:2010:i:2:p:1-21

Access Statistics for this article

International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede

More articles in International Journal of Data Warehousing and Mining (IJDWM) from IGI Global
Bibliographic data for series maintained by Journal Editor ().

 
Page updated 2025-03-19
Handle: RePEc:igg:jdwm00:v:6:y:2010:i:2:p:1-21