NUDIF: A Non-Uniform Deployment Framework for Distributed Inference in Heterogeneous Edge Clusters
Peng Li,
Chen Qing and
Hao Liu
Additional contact information
Peng Li: National Key Laboratory of Complex Aviation System Simulation, Chengdu 610036, China
Chen Qing: National Key Laboratory of Complex Aviation System Simulation, Chengdu 610036, China
Hao Liu: School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China
Future Internet, 2025, vol. 17, issue 4, 1-14
Abstract:
Distributed inference in resource-constrained heterogeneous edge clusters is fundamentally limited by disparities in device capabilities and load imbalance issues. Existing methods predominantly focus on optimizing single-pipeline allocation schemes for partitioned sub-models. However, such approaches often lead to load imbalance and suboptimal resource utilization under concurrent batch processing scenarios. To address these challenges, we propose a non-uniform deployment inference framework (NUDIF), which achieves high-throughput distributed inference service by adapting to heterogeneous resources and balancing inter-stage processing capabilities. Formulated as a mixed-integer nonlinear programming (MINLP) problem, NUDIF is responsible for planning the number of instances for each sub-model and determining the specific devices for deploying these instances, while considering computational capacity, memory constraints, and communication latency. This optimization minimizes inter-stage processing discrepancies and maximizes resource utilization. Experimental evaluations demonstrate that NUDIF enhances system throughput by an average of 9.95% compared to traditional single-pipeline optimization methods under various scales of cluster device configurations.
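The deployment planning described in the abstract can be illustrated with a toy sketch. This is not the authors' MINLP formulation or solver: it is a brute-force search over a tiny cluster, assuming each device hosts at most one sub-model instance and ignoring communication latency. The function name `plan_deployment` and all parameters (`stage_cost`, `stage_mem`, `dev_speed`, `dev_mem`) are hypothetical names introduced only for this illustration.

```python
from itertools import product


def plan_deployment(stage_cost, stage_mem, dev_speed, dev_mem):
    """Toy planner (illustrative assumption, not the paper's algorithm).

    stage_cost[s]: compute work per request for sub-model s
    stage_mem[s]:  memory needed to host one instance of sub-model s
    dev_speed[d]:  compute speed of device d (work units per second)
    dev_mem[d]:    memory available on device d

    Each device hosts one instance of one stage, or stays idle (None).
    Returns (bottleneck_throughput, assignment) where assignment[d] is the
    stage index placed on device d.
    """
    n_stages = len(stage_cost)
    choices = [None] + list(range(n_stages))
    best_tp, best_assign = 0.0, None

    # Enumerate every device-to-stage assignment (feasible only for tiny inputs).
    for assign in product(choices, repeat=len(dev_speed)):
        # Memory constraint: an instance must fit on its device.
        if any(s is not None and stage_mem[s] > dev_mem[d]
               for d, s in enumerate(assign)):
            continue

        # Stage throughput is the sum of its instances' request rates.
        stage_tp = [0.0] * n_stages
        for d, s in enumerate(assign):
            if s is not None:
                stage_tp[s] += dev_speed[d] / stage_cost[s]

        # The pipeline runs at the rate of its slowest stage.
        bottleneck = min(stage_tp)
        if bottleneck > best_tp:
            best_tp, best_assign = bottleneck, assign

    return best_tp, best_assign


if __name__ == "__main__":
    # Two sub-models, four heterogeneous devices; the last device is too
    # small to host the second sub-model.
    tp, assign = plan_deployment(
        stage_cost=[1.0, 3.0],
        stage_mem=[1, 2],
        dev_speed=[6.0, 3.0, 3.0, 3.0],
        dev_mem=[2, 2, 2, 1],
    )
    print("bottleneck throughput:", tp)
    print("device -> stage:", assign)
```

In this toy setting, giving the heavier sub-model more instances than the lighter one raises the bottleneck rate above what one instance per stage could reach, which is the intuition behind a non-uniform deployment. A realistic planner would also need the communication-latency terms mentioned in the abstract and a proper solver rather than enumeration.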
Keywords: distributed inference; heterogeneous computing clusters; deep neural networks; optimization methods
JEL-codes: O3
Date: 2025
Downloads:
https://www.mdpi.com/1999-5903/17/4/168/pdf (application/pdf)
https://www.mdpi.com/1999-5903/17/4/168/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:17:y:2025:i:4:p:168-:d:1632675
Future Internet is currently edited by Ms. Grace You