HYBRIDJOIN for Near-Real-Time Data Warehousing
M. Asif Naeem, Gillian Dobbie and Gerald Weber
Additional contact information
M. Asif Naeem: The University of Auckland, New Zealand
Gillian Dobbie: The University of Auckland, New Zealand
Gerald Weber: The University of Auckland, New Zealand
International Journal of Data Warehousing and Mining (IJDWM), 2011, vol. 7, issue 4, 21-42
Abstract:
An important component of near-real-time data warehouses is the near-real-time integration layer. One important element of near-real-time data integration is the join of a continuous input data stream with a disk-based relation. For high-throughput streams, stream-based algorithms such as Mesh Join (MESHJOIN) can be used. However, the performance of MESHJOIN is inversely proportional to the size of the disk-based relation. The Index Nested Loop Join (INLJ) can be set up to process stream input and can deal with intermittency in the update stream, but it has low throughput. This paper introduces a robust stream-based join algorithm called Hybrid Join (HYBRIDJOIN), which combines the two approaches. A theoretical result shows that HYBRIDJOIN is asymptotically as fast as the faster of the two algorithms. The authors present performance measurements of the implementation. In experiments using synthetic data based on a Zipfian distribution, HYBRIDJOIN performs significantly better for typical parameters of the Zipfian distribution, and in general performs in accordance with the theoretical model, while the other two algorithms are each unacceptably slow under different settings.
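The abstract does not describe how HYBRIDJOIN works internally, so the following Python sketch is only an illustration of the general idea of joining a stream with a keyed relation, not the authors' algorithm. It contrasts a per-tuple index lookup (INLJ style) with one plausible "hybrid" scheme that buffers stream tuples, uses the oldest buffered key to pick a page of the relation, and joins every buffered tuple that matches that page. All names (`index_nested_loop_join`, `hybrid_style_join`, `pages`, `page_of_key`, `buffer_size`) and the in-memory "pages" standing in for disk blocks are assumptions made for this example.

```python
from collections import defaultdict, deque
from itertools import islice


def index_nested_loop_join(stream, relation):
    """INLJ-style join: one keyed lookup per stream tuple."""
    out = []
    for key, payload in stream:
        if key in relation:
            out.append((key, payload, relation[key]))
    return out


def hybrid_style_join(stream, pages, page_of_key, buffer_size=4):
    """Hypothetical hybrid sketch (an assumption, not the published algorithm).

    Stream tuples wait in a bounded buffer. The oldest buffered key selects
    one page of the relation via an index (page_of_key); every buffered
    tuple whose key appears on that page is joined in the same pass.
    Returns the join results and the number of page loads performed.
    """
    stream = iter(stream)
    buffered = defaultdict(list)  # join key -> payloads waiting to be joined
    order = deque()               # buffered keys in arrival order
    out = []
    page_loads = 0
    while True:
        # Top up the buffer from the stream until it holds buffer_size tuples.
        for key, payload in islice(stream, buffer_size - len(order)):
            buffered[key].append(payload)
            order.append(key)
        if not order:
            break  # stream exhausted and buffer drained
        oldest = order[0]
        page_id = page_of_key.get(oldest)
        if page_id is None:
            # The key has no partner in the relation: discard its tuples.
            del buffered[oldest]
        else:
            page_loads += 1  # stands in for one disk read
            for key, row in pages[page_id].items():
                for payload in buffered.pop(key, []):
                    out.append((key, payload, row))
        # Forget queue entries whose tuples were joined or discarded.
        order = deque(k for k in order if k in buffered)
    return out, page_loads


# Toy data: a four-row relation split across two "pages".
relation = {1: "a", 2: "b", 3: "c", 4: "d"}
pages = [{1: "a", 2: "b"}, {3: "c", 4: "d"}]
page_of_key = {1: 0, 2: 0, 3: 1, 4: 1}
stream = [(1, "x1"), (3, "x2"), (2, "x3"), (1, "x4"), (9, "x5"), (4, "x6")]

inlj_result = index_nested_loop_join(stream, relation)
hybrid_result, hybrid_page_loads = hybrid_style_join(stream, pages, page_of_key)
```

In this toy setup both functions produce the same set of joined tuples, but the hybrid-style version may emit them in a different order because one page load can serve several buffered tuples at once. That amortization is the kind of effect the abstract attributes to combining the two approaches; the real cost model is given in the paper, not here.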
Date: 2011
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 4018/jdwm.2011100102 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:7:y:2011:i:4:p:21-42
International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede