Analysis of Frequently Failing Tasks and Rescheduling Strategy in the Cloud System
Hongyan Tang, Ying Li, Tong Jia, Xiaoyong Yuan and Zhonghai Wu
Additional contact information
Hongyan Tang: School of Software and Microelectronics, Peking University, Beijing, China
Ying Li: National Engineering Center of Software Engineering, Peking University, Beijing, China
Tong Jia: School of Software and Microelectronics, Peking University, Beijing, China
Xiaoyong Yuan: Department of Computer and Information Science and Engineering, University of Florida, Florida, USA
Zhonghai Wu: National Engineering Center of Software Engineering, Peking University, Beijing, China
International Journal of Distributed Systems and Technologies (IJDST), 2018, vol. 9, issue 1, 16-38
Abstract:
To better understand task failures in cloud computing systems, the authors analyze the failure frequency of tasks in the Google cluster dataset and find that some frequently failing tasks suffer from long-term failures and repeated rescheduling. These are called killer tasks because they can be a major concern for cloud systems, so there is a need to analyze them thoroughly and recognize them precisely. In this article, the authors first investigate the resource usage patterns of killer tasks and analyze the rescheduling strategies applied to them in the Google cluster, finding that repeated rescheduling wastes a large amount of resources. Based on these observations, they then propose an online killer task recognition service that recognizes killer tasks at a very early stage of their occurrence so as to avoid unnecessary resource waste. The experimental results show that the proposed service achieves 93.6% accuracy in recognizing killer tasks, with an 87% timing advance and 86.6% resource saving for the cloud system on average.
Date: 2018
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJDST.2018010102 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jdst00:v:9:y:2018:i:1:p:16-38
International Journal of Distributed Systems and Technologies (IJDST) is currently edited by Nik Bessis