Intelligent Software Engineering for Reliable Cloud Operations

Michael R. Lyu and Yuxin Su
Additional contact information
Michael R. Lyu: The Chinese University of Hong Kong
Yuxin Su: Sun Yat-sen University

A chapter in System Dependability and Analytics, 2023, pp 7-37 from Springer

Abstract: Reliable cloud operations are vital to our daily lives because many popular modern software systems are deployed in cloud systems. In this chapter, we discuss our experience in developing an AIOps (Artificial Intelligence for IT Operations) framework to improve the reliability of large-scale cloud systems with intelligent software engineering techniques. The comprehensive AIOps framework includes anomaly detection of key performance indicators, service dependency mining for failure diagnosis, and system incident aggregation for root cause analysis from various information sources such as meter data, topology, alerts, and incident tickets. We also conduct extensive experiments with production data collected from large-scale Huawei Cloud systems to demonstrate the effectiveness of intelligent software engineering techniques for reliable cloud operations.

Date: 2023

Persistent link: https://EconPapers.repec.org/RePEc:spr:ssrchp:978-3-031-02063-6_2

Ordering information: This item can be ordered from
http://www.springer.com/9783031020636

DOI: 10.1007/978-3-031-02063-6_2

More chapters in Springer Series in Reliability Engineering from Springer

 
Page updated 2025-04-20
Handle: RePEc:spr:ssrchp:978-3-031-02063-6_2