Machine Learning for Data Center Optimizations: Feature Selection Using Shapley Additive exPlanation (SHAP)
Yibrah Gebreyesus,
Damian Dalton,
Sebastian Nixon,
Davide De Chiara and
Marta Chinnici
Additional contact information
Yibrah Gebreyesus: School of Computer Science, University College of Dublin, D04 V1W8 Dublin, Ireland
Damian Dalton: School of Computer Science, University College of Dublin, D04 V1W8 Dublin, Ireland
Sebastian Nixon: School of Computer Science, Wolaita Sodo University, Wolaita P.O. Box 138, Ethiopia
Davide De Chiara: ENEA-R.C. Portici, 80055 Portici (NA), Italy
Marta Chinnici: ENEA-R.C. Casaccia, 00196 Rome, Italy
Future Internet, 2023, vol. 15, issue 3, 1-17
Abstract:
The need for artificial intelligence (AI) and machine learning (ML) models to optimize data center (DC) operations grows as the volume of operations management data increases rapidly. These strategies can help operators better understand their DC operations and make informed decisions in advance to maintain service reliability and availability. The strategies include developing models that optimize energy efficiency, identifying inefficient resource utilization and scheduling policies, and predicting outages. In addition to model hyperparameter tuning, feature subset selection (FSS) is critical for identifying the features relevant to modeling DC operations effectively, in order to provide insight into the data, optimize model performance, and reduce computational expense. Hence, this paper introduces the Shapley Additive exPlanation (SHAP) values method, a class of additive feature attribution values for identifying relevant features that is rarely discussed in the literature. We compared its effectiveness with several commonly used importance-based feature selection methods. The methods were tested on real DC operations data streams obtained from the ENEA CRESCO6 cluster with 20,832 cores. To demonstrate the effectiveness of SHAP compared with the other methods, we selected the top ten most important features from each method, retrained the predictive models, and evaluated their performance using the MAE, RMSE, and MAPE evaluation criteria. The results presented in this paper demonstrate that the predictive models trained on features selected with the SHAP-assisted method performed well, with lower error and a reasonable execution time compared with the other methods.
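The selection procedure summarized in the abstract (rank features by Shapley-based importance, keep the top k, then score the retrained model with error metrics) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy model `toy_predict`, the data `X`, and all function names are hypothetical, "absent" features are replaced by their column mean as a simplifying assumption, and the percentage metric is assumed to be the mean absolute percentage error. The exact enumeration below is exponential in the number of features, so real work would more likely rely on an approximation such as those in the `shap` library.

```python
from itertools import combinations
from math import factorial, sqrt


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample (exponential cost; small n only).

    Assumption: a feature absent from a coalition takes its baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def top_k_features(predict, X, k):
    """Rank features by mean |Shapley value| over the rows of X; keep the top k."""
    n = len(X[0])
    baseline = [sum(row[j] for row in X) / len(X) for j in range(n)]
    importance = [0.0] * n
    for row in X:
        for j, phi_j in enumerate(shapley_values(predict, row, baseline)):
            importance[j] += abs(phi_j) / len(X)
    return sorted(range(n), key=lambda j: importance[j], reverse=True)[:k]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def toy_predict(z):
    # Hypothetical stand-in for a trained model: feature 1 is irrelevant.
    return 3.0 * z[0] + 0.0 * z[1] - 1.0 * z[2] + 0.1 * z[3]


# Hypothetical data: three samples, four features.
X = [[1.0, 5.0, 2.0, 0.0], [2.0, 1.0, 4.0, 1.0], [3.0, 3.0, 6.0, 2.0]]
selected = top_k_features(toy_predict, X, k=2)
print(selected)  # indices of the two most important features
```

In a workflow like the one the abstract describes, the model would then be retrained on the selected columns only, and `mae`, `rmse`, and `mape` would be computed on held-out predictions to compare selection methods.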
Keywords: data center; artificial intelligence; machine learning; feature selection; SHAP; game theory
JEL-codes: O3
Date: 2023
Citations: 1 (EconPapers)
Downloads:
https://www.mdpi.com/1999-5903/15/3/88/pdf (application/pdf)
https://www.mdpi.com/1999-5903/15/3/88/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:15:y:2023:i:3:p:88-:d:1076144
Future Internet is currently edited by Ms. Grace You
Bibliographic data for series maintained by MDPI Indexing Manager.