A partition based method for finding highly correlated pairs
Shuxin Li and Sheau-Dong Lang
International Journal of Data Mining, Modelling and Management, 2010, vol. 2, issue 4, 334-350
Abstract:
The problem of finding highly correlated pairs is to output all item pairs whose (Pearson) correlation coefficients are greater than a user-specified correlation threshold. Effective discovery of such item pairs is of primary importance in many real data mining applications. Recently, the Taper algorithm was developed to discover the set of highly correlated item pairs. In this paper, we present a generalised Taper algorithm that finds strongly correlated pairs of items by partitioning the collection of transactions into different segments, so as to achieve a better pruning effect and a shorter running time. Consequently, it can be proved that both the naive algorithm and the Taper algorithm are special cases of our new algorithm with respect to the number of segments. Experimental results on real datasets demonstrate the feasibility and superiority of our algorithm.
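To make the problem statement concrete, the following is a minimal sketch of a brute-force baseline that computes the Pearson correlation (the phi coefficient for binary items) of every item pair and keeps those above the threshold. It is not the partition-based method of the paper; the function names `phi` and `correlated_pairs`, the representation of transactions as sets of items, and the choice to treat an undefined correlation as 0 are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def phi(n, n_a, n_b, n_ab):
    """Pearson (phi) correlation of two binary items, from counts.

    n    : total number of transactions
    n_a  : transactions containing item A
    n_b  : transactions containing item B
    n_ab : transactions containing both A and B
    """
    denom = sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    if denom == 0:
        # An item present in no or all transactions has zero variance,
        # so the correlation is undefined; this sketch reports 0 (assumption).
        return 0.0
    return (n * n_ab - n_a * n_b) / denom


def correlated_pairs(transactions, threshold):
    """Return (item_a, item_b, correlation) for every pair above threshold."""
    n = len(transactions)
    # Map each item to the set of transaction ids that contain it.
    tids = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tids.setdefault(item, set()).add(tid)

    result = []
    # Brute force: examine every unordered pair of items.
    for a, b in combinations(sorted(tids), 2):
        r = phi(n, len(tids[a]), len(tids[b]), len(tids[a] & tids[b]))
        if r > threshold:
            result.append((a, b, r))
    return result


if __name__ == "__main__":
    data = [{"a", "b"}, {"a", "b"}, {"c"}, {"a", "c"}]
    for a, b, r in correlated_pairs(data, threshold=0.5):
        print(a, b, round(r, 3))
```

The baseline evaluates all pairs, so its cost grows quadratically with the number of items; pruning approaches such as the one described in the abstract aim to avoid computing the exact correlation for most of those pairs.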
Keywords: correlation; association rules; Pearson correlation coefficients; transactional databases; data mining; partition; highly correlated pairs
Date: 2010
Downloads: (external link)
http://www.inderscience.com/link.php?id=35562 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:2:y:2010:i:4:p:334-350
More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd