ROUTE: run‐time robust reducer workload estimation for MapReduce
Zhihong Liu,
Qi Zhang,
Raouf Boutaba,
Yaping Liu and
Zhenghu Gong
International Journal of Network Management, 2016, vol. 26, issue 3, 224-244
Abstract:
MapReduce has become a popular model for large‐scale data processing in recent years. Many works on MapReduce scheduling (e.g., load balancing and deadline‐aware scheduling) have emphasized the importance of predicting workload received by individual reducers. However, because the input characteristics and user‐specified map function of a given job are unknown to the MapReduce framework before the job starts, accurately predicting workload of reducers can be a difficult challenge. To address this challenge, we present ROUTE, a run‐time robust reducer workload estimation technique for MapReduce. ROUTE progressively samples the partition size of the early completed mappers, allowing ROUTE to perform estimation at run time yet fulfilling the accuracy requirement specified by users. Moreover, by using robust estimation and bootstrapping resampling techniques, ROUTE can achieve high applicability to a wide variety of applications. Through experiments using both real and synthetic data on an 11‐node Hadoop cluster, we show ROUTE can achieve high accuracy with error rate no more than 10.92% and an improvement of 40.6% in terms of error rate while compared with the state‐of‐the‐art solution. Besides, through simulations using synthetic data, we show that ROUTE is robust to a variety of skewed distributions. Finally, we apply ROUTE to existing load balancing and deadline‐aware scheduling frameworks and show ROUTE significantly improves the performance of these frameworks. Copyright © 2016 John Wiley & Sons, Ltd.
Date: 2016
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://doi.org/10.1002/nem.1928
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:wly:intnem:v:26:y:2016:i:3:p:224-244
Access Statistics for this article
More articles in International Journal of Network Management from John Wiley & Sons
Bibliographic data for series maintained by Wiley Content Delivery ().