Evaluating DL Model Scaling Trade-Offs During Inference via an Empirical Benchmark Analysis
Demetris Trihinas,
Panagiotis Michael and
Moysis Symeonides
Additional contact information
Demetris Trihinas: Department of Computer Science, School of Sciences and Engineering, University of Nicosia, Nicosia CY-2417, Cyprus
Panagiotis Michael: Department of Computer Science, School of Sciences and Engineering, University of Nicosia, Nicosia CY-2417, Cyprus
Moysis Symeonides: Department of Computer Science, University of Cyprus, Nicosia CY-2109, Cyprus
Future Internet, 2024, vol. 16, issue 12, 1-16
Abstract:
With generative Artificial Intelligence (AI) capturing public attention, the appetite of the technology sector for larger and more complex Deep Learning (DL) models is continuously growing. Traditionally, the focus in DL model development has been on scaling the neural network’s foundational structure to increase computational complexity and enhance the representational expressiveness of the model. However, with recent advancements in edge computing and 5G networks, DL models are now aggressively being deployed and utilized across the cloud–edge–IoT continuum for the realization of in situ intelligent IoT services. This paradigm shift creates a growing need for AI practitioners to account for inference costs, including latency, computational overhead, and energy efficiency, a focus that is long overdue. This work presents a benchmarking framework designed to assess DL model scaling across three key performance axes during model inference: classification accuracy, computational overhead, and latency. The framework’s utility is demonstrated through an empirical study involving various model structures and variants, as well as publicly available datasets for three popular DL use cases covering natural language understanding, object detection, and regression analysis.
Keywords: deep learning; artificial intelligence; cloud computing; benchmarking
JEL-codes: O3
Date: 2024
Downloads:
https://www.mdpi.com/1999-5903/16/12/468/pdf (application/pdf)
https://www.mdpi.com/1999-5903/16/12/468/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:16:y:2024:i:12:p:468-:d:1543624
Future Internet is currently edited by Ms. Grace You
Bibliographic data for series maintained by MDPI Indexing Manager.