A Pareto Model for OLAP View Size Estimation
Thomas P. Nadeau and Toby J. Teorey
Additional contact information
Thomas P. Nadeau: The University of Michigan
Information Systems Frontiers, 2003, vol. 5, issue 2, No 3, 137-147
Abstract:
On-Line Analytical Processing (OLAP) aims at gaining useful information quickly from large amounts of data residing in a data warehouse. To improve the quickness of response to queries, pre-aggregation is a useful strategy. However, it is usually impossible to pre-aggregate along all combinations of the dimensions. The multi-dimensional aspects of the data lead to combinatorial explosion in the number and potential storage size of the aggregates. We must selectively pre-aggregate. Cost/benefit analysis involves estimating the storage requirements of the aggregates in question. We present an original algorithm for estimating the number of rows in an aggregate based on the Pareto distribution model. We test the Pareto Model Algorithm empirically against four published algorithms, and conclude the Pareto Model Algorithm is consistently the best of these algorithms for estimating view size.
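To illustrate the combinatorial explosion and what a view size estimate looks like, the Python sketch below enumerates every group-by combination of a small set of dimensions and applies Cardenas' formula, a classic estimate of the number of distinct cells under a uniform-distribution assumption. This is not the Pareto Model Algorithm, whose details the abstract does not give, and the abstract does not name the four published algorithms used in the comparison. The function names, dimension names, cardinalities, and row count are all hypothetical.

```python
from itertools import combinations
from math import prod


def candidate_views(dimensions):
    """Return every subset of the dimensions; each subset is one candidate aggregate.

    With d dimensions (and no hierarchies) there are 2**d subsets,
    including the empty subset, which is the grand total.
    """
    views = []
    for k in range(len(dimensions) + 1):
        views.extend(combinations(dimensions, k))
    return views


def cardenas_estimate(cardinalities, fact_rows):
    """Estimate the row count of a view with Cardenas' formula.

    Assumes (this is the uniform baseline, not the Pareto model) that
    fact_rows rows fall uniformly at random into prod(cardinalities)
    possible cells, and returns the expected number of non-empty cells.
    """
    cells = prod(cardinalities)  # prod of an empty list is 1 (grand total)
    return cells * (1.0 - (1.0 - 1.0 / cells) ** fact_rows)


if __name__ == "__main__":
    # Hypothetical dimension cardinalities and fact table size.
    dims = {"product": 1000, "store": 100, "day": 365}
    fact_rows = 1_000_000

    for view in candidate_views(list(dims)):
        cards = [dims[name] for name in view]
        estimate = cardenas_estimate(cards, fact_rows)
        print(view, round(estimate))
```

Three dimensions already yield eight candidate views, and each added dimension doubles the count, which is why a cheap size estimate per view matters for the cost/benefit analysis.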
Keywords: Pareto distribution; OLAP; view size estimation; materialized view selection
Date: 2003
Downloads:
http://link.springer.com/10.1023/A:1022693305401 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:infosf:v:5:y:2003:i:2:d:10.1023_a:1022693305401
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10796
DOI: 10.1023/A:1022693305401
Information Systems Frontiers is currently edited by Ram Ramesh and Raghav Rao
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.