Multi-Agent Reinforcement Learning with Two-Layer Control Plane for Traffic Engineering

Stepanov, Evgeniy; Smeliansky, Ruslan; Garkavy, Ivan

Evgeniy Stepanov, Ruslan Smeliansky and Ivan Garkavy
Additional contact information
Evgeniy Stepanov: Department of Computing Systems and Automation, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119991 Moscow, Russia
Ruslan Smeliansky: Department of Computing Systems and Automation, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119991 Moscow, Russia
Ivan Garkavy: Department of Computing Systems and Automation, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119991 Moscow, Russia

Mathematics, 2025, vol. 13, issue 19, 1-24

Abstract: The article presents a new method for multi-agent traffic flow balancing, based on the MAROH multi-agent optimization method. Unlike MAROH, the agent’s control plane is built on the principles of human decision-making and consists of two layers. The first layer lets the agent decide autonomously from accumulated experience: a memory of representative states the agent has already encountered, together with the actions to take in them. The second layer enables the agent to make decisions in unfamiliar states. A state is considered familiar to the agent if it is close, in terms of a specific metric, to a state the agent has already encountered. The article explores variants of state proximity metrics and various ways to organize the agent’s memory. Experiments show that an agent with the proposed two-layer control plane, SAMAROH-2L, is more efficient than an agent with a single-layer control plane; for example, it makes decisions faster. Depending on the selected similarity threshold, it reduces inter-agent communication by 1% to 80% compared with SAMAROH, the method with simultaneous actions, and by 80% to 96% compared with MAROH.
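
The two-layer decision rule described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the authors' implementation: the class `TwoLayerAgent`, the Euclidean distance used as the proximity metric, the flat list used as memory, and the `fallback` callable that stands in for whatever the second layer actually does.

```python
import math
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]


class TwoLayerAgent:
    """Hypothetical two-layer agent: memory lookup first, fallback second."""

    def __init__(
        self,
        threshold: float,
        fallback: Callable[[State], int],
        metric: Callable[[State, State], float] = math.dist,  # assumed metric
    ) -> None:
        self.threshold = threshold  # similarity threshold (assumed semantics)
        self.fallback = fallback    # placeholder for the second layer
        self.metric = metric
        self.memory: List[Tuple[State, int]] = []  # (representative, action)
        self.fallback_calls = 0     # how often the second layer was needed

    def act(self, state: State) -> int:
        # Layer 1: reuse the action of the nearest remembered state
        # if that state lies within the similarity threshold.
        if self.memory:
            rep, action = min(
                self.memory, key=lambda entry: self.metric(entry[0], state)
            )
            if self.metric(rep, state) <= self.threshold:
                return action
        # Layer 2: the state is unfamiliar, so delegate to the fallback
        # and remember the result as a new representative.
        self.fallback_calls += 1
        action = self.fallback(state)
        self.memory.append((tuple(state), action))
        return action
```

In this sketch a larger threshold makes more states count as familiar, so the second layer is invoked less often; this is only meant to mirror the threshold dependence the abstract reports, not to reproduce its figures.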

Keywords: traffic engineering; multi-agent reinforcement learning; traffic load balancing
JEL-codes: C
Date: 2025

Downloads:
https://www.mdpi.com/2227-7390/13/19/3180/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/19/3180/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:19:p:3180-:d:1764645

Mathematics is currently edited by Ms. Emma He
