RS-MADDPG: Routing Strategy Based on Multi-Agent Deep Deterministic Policy Gradient for Differentiated QoS Services
Shi Kuang,
Jinyu Zheng,
Shilin Liang,
Yingying Li,
Siyuan Liang and
Wanwei Huang
Additional contact information
Shi Kuang: Transmission Operation and Inspection Center, State Grid Zhengzhou Electric Power Supply Company, Zhengzhou 450007, China
Jinyu Zheng: DC Branch, State Grid Henan Electric Power Company, Zhengzhou 450052, China
Shilin Liang: College of Big Data and Artificial Intelligence, Zhengzhou University of Economics and Business, Zhengzhou 450099, China
Yingying Li: College of Electronics and Communication Engineering, Shenzhen Polytechnic University, Shenzhen 518005, China
Siyuan Liang: College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450007, China
Wanwei Huang: College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450007, China
Future Internet, 2025, vol. 17, issue 9, 1-20
Abstract:
As network environments become increasingly dynamic and users’ Quality of Service (QoS) demands grow more diverse, efficient and adaptive routing strategies are urgently needed. However, traditional routing strategies suffer from limitations such as poor adaptability to fluctuating traffic, lack of differentiated service handling, and slow convergence in complex network scenarios. To address these limitations, we propose a routing strategy based on multi-agent deep deterministic policy gradient for differentiated QoS services (RS-MADDPG) in a software-defined networking (SDN) environment. First, network state information is collected in real time and transmitted to the control layer for processing. Then, the processed information is forwarded to the intelligent layer, where multiple agents cooperate during training to learn routing policies that adapt to dynamic network conditions. Finally, the learned policies enable the agents to make adaptive routing decisions that explicitly address differentiated QoS requirements, using a custom reward structure that dynamically balances throughput, delay, and packet loss according to traffic type. Simulation results demonstrate that RS-MADDPG achieves convergence approximately 30 training cycles earlier than baseline methods, while improving average throughput by 3%, reducing latency by 7%, and lowering packet loss rate by 2%.
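The traffic-type-dependent reward described in the abstract can be illustrated with a minimal Python sketch. The weight table, the function name qos_reward, and the normalization below are assumptions introduced for illustration only; the paper's actual reward formulation and coefficients are not reproduced here.

# Hypothetical sketch of a traffic-type-aware reward: the weights and names
# below are illustrative assumptions, not values taken from the paper.
QOS_WEIGHTS = {
    "throughput_sensitive": {"throughput": 0.6, "delay": 0.2, "loss": 0.2},
    "delay_sensitive":      {"throughput": 0.2, "delay": 0.6, "loss": 0.2},
    "loss_sensitive":       {"throughput": 0.2, "delay": 0.2, "loss": 0.6},
}

def qos_reward(traffic_type, throughput, delay, loss,
               max_throughput=1.0, max_delay=1.0):
    """Combine normalized throughput, delay, and packet loss into a scalar
    reward whose weighting depends on the traffic type."""
    w = QOS_WEIGHTS[traffic_type]
    r_throughput = min(throughput / max_throughput, 1.0)   # higher is better
    r_delay = 1.0 - min(delay / max_delay, 1.0)             # lower delay is better
    r_loss = 1.0 - min(loss, 1.0)                            # lower loss is better
    return (w["throughput"] * r_throughput
            + w["delay"] * r_delay
            + w["loss"] * r_loss)

# Example: for a delay-sensitive flow, a large delay lowers the reward more
# than it would for a throughput-sensitive flow with the same measurements.
print(qos_reward("delay_sensitive", throughput=0.9, delay=0.8, loss=0.01))
print(qos_reward("delay_sensitive", throughput=0.9, delay=0.1, loss=0.01))

Each metric is normalized to [0, 1] so that, under these assumptions, the traffic-type weights alone determine the trade-off among the three QoS objectives.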
Keywords: quality of service; routing strategy; multi-agent deep deterministic policy gradient; software-defined networking
JEL-codes: O3
Date: 2025
Downloads:
https://www.mdpi.com/1999-5903/17/9/393/pdf (application/pdf)
https://www.mdpi.com/1999-5903/17/9/393/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:17:y:2025:i:9:p:393-:d:1737497