HUPT-mine: an efficient algorithm for high utility pattern mining
Ramaraju Chithra and Savarimuthu Nickolas
International Journal of Business and Systems Research, 2012, vol. 6, issue 3, 279-295
Abstract:
In recent years, high utility pattern mining has become one of the most important research areas in data mining. High utility pattern mining extracts patterns whose utility value is greater than or equal to a user-specified minimum utility. The problem is challenging because the anti-monotone property used in frequent pattern mining does not apply to utility. The existing high utility pattern mining algorithm adopts level-wise candidate generation, and many recently proposed approaches also generate a large number of candidate itemsets. In this paper, a novel high utility pattern tree (HUPT) is proposed that applies two pruning strategies to reduce the number of candidate itemsets with two scans of the database. For each conditional pattern base, a local tree is constructed with the information required to generate candidate itemsets, following the pattern growth approach. Experimental results on different datasets show that the method reduces the number of candidate itemsets and outperforms the two-phase algorithm on dense datasets with long transactions.
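The definitions in the abstract can be illustrated with a small sketch. The transactions, unit profits, and function names below are hypothetical and chosen only for illustration. The code enumerates itemsets by brute force, so it is neither the HUPT construction nor the paper's two pruning strategies, which the abstract does not specify. It shows why utility is not anti-monotone, and it computes the transaction-weighted utilization (TWU), the upper bound that the two-phase algorithm is generally described as using for pruning.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical per-unit profit (external utility) of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def transaction_utility(tx):
    """Utility of a whole transaction: quantity times unit profit, summed."""
    return sum(qty * unit_profit[item] for item, qty in tx.items())


def utility(itemset, db):
    """Utility of an itemset, summed over the transactions that contain it."""
    total = 0
    for tx in db:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def twu(itemset, db):
    """Transaction-weighted utilization: the summed utility of every
    transaction containing the itemset. It never falls below the itemset's
    utility and, unlike utility, cannot grow when items are added."""
    return sum(
        transaction_utility(tx)
        for tx in db
        if all(item in tx for item in itemset)
    )


def high_utility_itemsets(db, min_util):
    """Brute-force enumeration of itemsets with utility >= min_util."""
    items = sorted({item for tx in db for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            u = utility(candidate, db)
            if u >= min_util:
                result[candidate] = u
    return result


if __name__ == "__main__":
    print(high_utility_itemsets(transactions, min_util=15))
    print(utility(("b",), transactions), utility(("a", "b"), transactions))
    print(twu(("b", "c"), transactions))
```

With a minimum utility of 15, the itemset {b} has utility 8 and is not high utility, yet its superset {a, b} has utility 16 and is. This is the failure of the anti-monotone property that the abstract refers to. TWU restores a usable bound: {b, c} has a TWU of 13, below the threshold, so neither it nor any of its supersets can be high utility.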
Keywords: high utility pattern mining; two-phase algorithms; high utility pattern trees; HUPT; HUP growth; data mining; pruning strategies.
Date: 2012
Downloads: (external link)
http://www.inderscience.com/link.php?id=47927 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijbsre:v:6:y:2012:i:3:p:279-295
More articles in International Journal of Business and Systems Research from Inderscience Enterprises Ltd