Resolving the Classic Resource Allocation Conflict in On-Ramp Merging: A Regionally Coordinated Nash-Advantage Decomposition Deep Q-Network Approach for Connected and Automated Vehicles
Linning Li and
Lili Lu
Additional contact information
Linning Li: Faculty of Maritime and Transportation, Ningbo University, Ningbo 315211, China
Lili Lu: Faculty of Maritime and Transportation, Ningbo University, Ningbo 315211, China
Sustainability, 2025, vol. 17, issue 17, 1-27
Abstract:
To improve the traffic efficiency of connected and automated vehicles (CAVs) in on-ramp merging areas, this study proposes a novel region-level multi-agent reinforcement learning framework, the Regionally Coordinated Nash-Advantage Decomposition Deep Q-Network with Conflict-Aware Q Fusion (RC-NashAD-DQN). Unlike existing vehicle-level control methods, which suffer from high computational overhead and poor scalability, our approach abstracts the on-ramp and main-road areas as region-level control agents, achieving coordinated yet independent decision-making while maintaining control precision and merging efficiency comparable to fine-grained vehicle-level approaches. Each agent adopts a value–advantage decomposition architecture to stabilize policy learning and better discriminate among action values, while sharing state–action information to improve inter-agent awareness. A Nash equilibrium solver derives joint strategies, and a conflict-aware Q-fusion mechanism is introduced as a regularization term rather than a direct action-selection tool, enabling the system to resolve local conflicts, particularly at region boundaries, without compromising global coordination. This design reduces training complexity, accelerates convergence, and improves robustness against communication imperfections. The framework is evaluated in the SUMO simulator at the Taishan Road interchange on the S1 Yongtaiwen Expressway under heterogeneous traffic conditions involving both passenger cars and container trucks, and is compared with baseline models including C-DRL-VSL and MADDPG. Extensive simulations demonstrate that RC-NashAD-DQN improves average traffic speed by 17.07% and reduces average delay by 12.68 s, outperforming all baselines on efficiency metrics while maintaining robust convergence. These improvements enhance cooperation and merging efficiency among vehicles, contributing to sustainable urban mobility and the advancement of intelligent transportation systems.
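For readers who want a concrete picture of the three ingredients the abstract names, the sketch below illustrates (1) a value–advantage (dueling-style) Q-network for one region agent, (2) a pure-strategy Nash equilibrium search over the two agents' joint action payoffs, and (3) a conflict-aware Q-fusion term used as a loss regularizer rather than for action selection. This is a minimal sketch under stated assumptions, not the paper's implementation: all names (RegionAgentQNet, pure_nash_actions, conflict_aware_fusion_penalty), the network sizes, and the binary conflict_mask encoding are illustrative inventions of this note.

```python
# Illustrative sketch of the RC-NashAD-DQN building blocks described in the
# abstract. Hypothetical names and shapes; not the authors' code.
import torch
import torch.nn as nn


class RegionAgentQNet(nn.Module):
    """Dueling-style Q-network for one region-level agent (main road or on-ramp)."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value_head = nn.Linear(hidden, 1)              # V(s)
        self.advantage_head = nn.Linear(hidden, n_actions)  # A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.backbone(state)
        value = self.value_head(h)
        adv = self.advantage_head(h)
        # Standard dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)
        return value + adv - adv.mean(dim=-1, keepdim=True)


def pure_nash_actions(q_main: torch.Tensor, q_ramp: torch.Tensor):
    """Return a pure-strategy Nash equilibrium of the two-agent game whose
    payoff matrices are q_main and q_ramp, each of shape (n_main, n_ramp)."""
    best_main = q_main.argmax(dim=0)  # main's best response to each ramp action
    best_ramp = q_ramp.argmax(dim=1)  # ramp's best response to each main action
    for a in range(q_main.shape[0]):
        for b in range(q_main.shape[1]):
            if best_main[b] == a and best_ramp[a] == b:
                return a, b
    # No pure equilibrium: fall back to the jointly greedy action pair.
    idx = (q_main + q_ramp).flatten().argmax().item()
    return divmod(idx, q_main.shape[1])


def conflict_aware_fusion_penalty(q_main: torch.Tensor,
                                  q_ramp: torch.Tensor,
                                  conflict_mask: torch.Tensor) -> torch.Tensor:
    """Penalize high joint value on action pairs flagged as conflicting at the
    region boundary (e.g., simultaneous merge commands).

    q_main: (batch, n_main), q_ramp: (batch, n_ramp),
    conflict_mask: (n_main, n_ramp) with 1 marking a conflicting pair."""
    # Outer sum of per-agent Q-values scores every joint action pair.
    joint = q_main.unsqueeze(-1) + q_ramp.unsqueeze(-2)  # (batch, n_main, n_ramp)
    # Used as a regularizer, not for action selection, matching the abstract.
    return (joint * conflict_mask).clamp(min=0).mean()
```

In training, each region agent would then minimize its usual DQN temporal-difference loss plus a weighted fusion penalty (e.g., loss = td_main + td_ramp + lambda * penalty, with lambda a tuning assumption), which is consistent with the abstract's characterization of conflict-aware Q-fusion as a regularization term that discourages boundary conflicts without overriding the Nash-based joint strategy.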
Keywords: Nash equilibrium; deep Q-learning; Q-value fusion; collaboration and competition; SUMO simulation; connected and autonomous vehicles; sustainable
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2025
Downloads: (external link)
https://www.mdpi.com/2071-1050/17/17/7826/pdf (application/pdf)
https://www.mdpi.com/2071-1050/17/17/7826/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:17:y:2025:i:17:p:7826-:d:1738189
Sustainability is currently edited by Ms. Alexandra Wu
More articles in Sustainability from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.