Elasticity in Cloud Databases and Their Query Processing

Graefe, Goetz; Nica, Anisoara; Stolze, Knut; Neumann, Thomas; Eavis, Todd; Petrov, Ilia; Pourabbas, Elaheh; Fekete, David

Goetz Graefe, Anisoara Nica, Knut Stolze, Thomas Neumann, Todd Eavis, Ilia Petrov, Elaheh Pourabbas and David Fekete
Additional contact information
Goetz Graefe: Research in Business Intelligence, Hewlett-Packard Laboratories, Palo Alto, CA, USA
Anisoara Nica: SQL Anywhere Research and Development, Sybase (An SAP Company), Waterloo, ON, Canada
Knut Stolze: Information Management Department, IBM Germany Research & Development, Böblingen, Germany
Thomas Neumann: Technische Universität München, Garching, Germany
Todd Eavis: Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada
Ilia Petrov: Data Management Lab, School of Informatics, Reutlingen University, Germany
Elaheh Pourabbas: Institute of Systems Analysis and Computer Science “Antonio Ruberti”, National Research Council, Rome, Italy
David Fekete: Department of Information Systems, Universität Münster, Münster, Germany

International Journal of Data Warehousing and Mining (IJDWM), 2013, vol. 9, issue 2, 1-20

Abstract: A central promise of cloud services is elastic, on-demand provisioning. The provisioning of data on temporarily available nodes is what makes elastic database services a hard problem. The essential task that enables elastic data services is bringing a node and its data up-to-date. Strategies for high availability do not satisfy the need in this context because they bring nodes online and up-to-date by repeating history, e.g., by log shipping. Nodes must become up-to-date and useful for query processing incrementally by key range. What is wanted is a technique such that in a newly added node, during each short period of time, an additional small key range becomes up-to-date, until eventually the entire dataset becomes up-to-date and useful for query processing, with overall update performance comparable to a traditional high-availability strategy that carries the entire dataset forward without regard to key ranges. Even without the entire dataset being available, the node is productive and participates in query processing tasks. The authors’ proposed solution relies on techniques from partitioned B-trees, adaptive merging, deferred maintenance of secondary indexes and of materialized views, and query optimization using materialized views. The paper introduces a family of maintenance strategies for temporarily available copies, the space of possible query execution plans and their cost functions, as well as appropriate query optimization techniques.
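As a rough illustration of the idea in the abstract (a new node becoming current one key range at a time while already answering queries that fall inside the ranges it has caught up on), a minimal sketch might look like the following. This is not the authors' technique: the paper relies on partitioned B-trees, adaptive merging, and deferred index and view maintenance, whereas this sketch assumes integer keys, a plain dictionary as the store, and direct access to the primary copy. The names `ElasticReplica`, `catch_up`, `covers`, `merge_ranges`, and `route_query` are hypothetical and do not come from the paper.

```python
def merge_ranges(ranges):
    """Merge overlapping or adjacent half-open [lo, hi) ranges."""
    merged = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def scan(data, lo, hi):
    """Return the sorted (key, value) pairs with lo <= key < hi."""
    return sorted((k, v) for k, v in data.items() if lo <= k < hi)


class ElasticReplica:
    """Hypothetical temporary copy that becomes current range by range."""

    def __init__(self, stale_data):
        self.data = dict(stale_data)  # possibly outdated copy of the dataset
        self.current = []             # disjoint [lo, hi) ranges known to be up to date

    def catch_up(self, primary, lo, hi):
        """Bring keys in [lo, hi) up to date, then mark the range as current."""
        # Assumption: the primary is a dict we can read directly; a real system
        # would apply logged changes for this key range instead.
        for key in [k for k in self.data if lo <= k < hi]:
            del self.data[key]
        for key, value in scan(primary, lo, hi):
            self.data[key] = value
        self.current = merge_ranges(self.current + [(lo, hi)])

    def covers(self, lo, hi):
        """True if [lo, hi) lies entirely inside one up-to-date range."""
        return any(a <= lo and hi <= b for a, b in self.current)


def route_query(primary, replica, lo, hi):
    """Use the replica only when it is current for the whole query range."""
    if replica.covers(lo, hi):
        return "replica", scan(replica.data, lo, hi)
    return "primary", scan(primary, lo, hi)


primary = {1: "a", 5: "b", 12: "c", 18: "d"}
replica = ElasticReplica({1: "old", 5: "b", 12: "old"})

replica.catch_up(primary, 0, 10)             # first small key range becomes current
print(route_query(primary, replica, 0, 10))  # answered by the replica
print(route_query(primary, replica, 5, 15))  # still answered by the primary

replica.catch_up(primary, 10, 20)            # next key range becomes current
print(route_query(primary, replica, 5, 15))  # now answered by the replica
```

The sketch makes only the routing decision explicit. It says nothing about plan cost or about keeping caught-up ranges current as later updates arrive, both of which the abstract names as part of the paper's contribution.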

Date: 2013

Downloads: (external link)
https://services.igi-global.com/resolvedoi/resolve ... 4018/jdwm.2013040101 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:9:y:2013:i:2:p:1-20

International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede

More articles in International Journal of Data Warehousing and Mining (IJDWM) from IGI Global Scientific Publishing