Synthesizing High-Utility Patterns from Different Data Sources
Abhinav Muley and Manish Gudadhe
Additional contact information
Abhinav Muley: Department of Computer Engineering, St. Vincent Pallotti College of Engineering & Technology, Nagpur 441108, India
Manish Gudadhe: Department of Computer Engineering, St. Vincent Pallotti College of Engineering & Technology, Nagpur 441108, India
Data, 2018, vol. 3, issue 3, 1-16
Abstract:
Large organizations often need to collect data from branches spread over different geographic locations. Extensive amounts of data may be gathered at a centralized location in order to generate interesting patterns via mono-mining the amassed database. However, it is feasible to mine the useful patterns at each data source and forward only these patterns to the central location, rather than the entire original database. These patterns also exist in huge numbers, and different sources calculate different utility values for each pattern. This paper proposes a weighted model for aggregating the high-utility patterns from different data sources. A pattern selection procedure is also proposed to efficiently extract high-utility patterns in the weighted model by discarding low-utility patterns. The synthesizing model yields high-utility patterns, unlike association rule mining, in which frequent itemsets are generated by treating every item as having equal utility, an assumption that does not hold in real-life applications such as sales transactions. Extensive experiments on datasets with varied characteristics show that the proposed algorithm is effective for mining sparse and very sparse databases with a huge number of transactions. The proposed model also outperforms various state-of-the-art distributed mining models in terms of running time.
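As a rough illustration of the idea in the abstract, the following minimal Python sketch aggregates locally mined patterns with per-source weights and then discards low-utility patterns. It is not the authors' algorithm: the abstract gives no formulas, so the function name `synthesize`, the assumption that each source reports a normalized utility between 0 and 1, the caller-supplied weights, and the threshold `min_utility` are all hypothetical choices made for this example.

```python
from collections import defaultdict


def synthesize(sources, weights, min_utility):
    """Combine locally mined patterns into global weighted utilities.

    sources:     {site: {pattern (frozenset of items): local utility in [0, 1]}}
    weights:     {site: weight}, assumed here to be supplied by the caller
    min_utility: hypothetical threshold below which a pattern is discarded
    """
    global_utility = defaultdict(float)
    for site, patterns in sources.items():
        for pattern, utility in patterns.items():
            # A source that did not report a pattern contributes nothing to it.
            global_utility[pattern] += weights[site] * utility
    # Pattern selection: keep only patterns that reach the threshold.
    return {p: u for p, u in global_utility.items() if u >= min_utility}


# Made-up example data for two branches (not taken from the paper).
sources = {
    "branch_a": {frozenset({"bread", "milk"}): 0.40, frozenset({"tea"}): 0.10},
    "branch_b": {frozenset({"bread", "milk"}): 0.20, frozenset({"jam"}): 0.30},
}
weights = {"branch_a": 0.5, "branch_b": 0.5}

result = synthesize(sources, weights, min_utility=0.2)
for pattern, utility in result.items():
    print(sorted(pattern), round(utility, 3))
```

With these made-up numbers only the pattern {bread, milk} survives, because it is the only one whose weighted utility reaches the threshold. How the weights and the threshold should actually be chosen is specified in the paper itself, not here.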
Keywords: data integration; data mining; high-utility patterns; knowledge discovery; weighted model; multi-database mining; distributed data mining
JEL-codes: C8 C80 C81 C82 C83
Date: 2018
Downloads:
https://www.mdpi.com/2306-5729/3/3/32/pdf (application/pdf)
https://www.mdpi.com/2306-5729/3/3/32/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:3:y:2018:i:3:p:32-:d:167436
Data is currently edited by Ms. Cecilia Yang
Bibliographic data for series maintained by MDPI Indexing Manager.