Deep Reinforcement Learning-Based Multi-Agent System with Advanced Actor–Critic Framework for Complex Environment
Zihao Cui,
Kailian Deng,
Hongtao Zhang,
Zhongyi Zha and
Sayed Jobaer
Additional contact information
Zihao Cui: College of Information Science and Technology, Donghua University, Shanghai 201620, China
Kailian Deng: College of Information Science and Technology, Donghua University, Shanghai 201620, China
Hongtao Zhang: College of Information Science and Technology, Donghua University, Shanghai 201620, China
Zhongyi Zha: College of Information Science and Technology, Donghua University, Shanghai 201620, China
Sayed Jobaer: College of Information Science and Technology, Donghua University, Shanghai 201620, China
Mathematics, 2025, vol. 13, issue 5, 1-22
Abstract:
The development of artificial intelligence (AI) game agents that use deep reinforcement learning (DRL) algorithms to process visual information for decision-making has emerged as a key research focus in both academia and industry. However, previous game agents have struggled to execute multiple commands simultaneously within a single decision, and so fail to replicate the complex control patterns that characterize human gameplay. In this paper, we use the ViZDoom environment as the DRL research platform and model the agent–environment interaction as a Partially Observable Markov Decision Process (POMDP). We introduce an advanced multi-agent DRL framework, Multi-Agent Proximal Policy Optimization (MA-PPO), designed to optimize target acquisition under defined ammunition and time constraints. In MA-PPO, each agent handles a distinct parallel task with a custom reward function for performance evaluation. The agents make independent decisions while simultaneously executing multiple commands to mimic human-like gameplay behavior. Our evaluation compares MA-PPO against other DRL algorithms, showing a 30.67% performance improvement over the baseline algorithm.
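The two ideas named in the abstract, several agents deciding independently while their commands are issued together, and PPO-style policy updates, can be illustrated with a minimal sketch. This is not the authors' implementation: the button list, the three-agent split (`move`, `aim`, `fire`), and the function names are hypothetical and chosen only for illustration. The clipped term follows the standard PPO surrogate objective, min(r·A, clip(r, 1−ε, 1+ε)·A), where r is the ratio of new to old action probability and A is the advantage estimate.

```python
# Hypothetical button layout; the paper's actual action space may differ.
BUTTONS = ["MOVE_LEFT", "MOVE_RIGHT", "TURN_LEFT", "TURN_RIGHT", "ATTACK"]

# Assumption: each agent owns a disjoint group of buttons plus a no-op (None).
AGENT_CHOICES = {
    "move": [None, "MOVE_LEFT", "MOVE_RIGHT"],
    "aim": [None, "TURN_LEFT", "TURN_RIGHT"],
    "fire": [None, "ATTACK"],
}


def combine_actions(choices):
    """Merge per-agent choice indices into one multi-button command vector.

    `choices` maps an agent name to the index of the option it selected.
    The result has one 0/1 entry per button, so several buttons can be
    pressed in the same decision step.
    """
    pressed = {AGENT_CHOICES[name][idx] for name, idx in choices.items()}
    return [1 if button in pressed else 0 for button in BUTTONS]


def ppo_clip_term(new_prob, old_prob, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A)."""
    ratio = new_prob / old_prob
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Strafe left, turn right, and shoot in a single decision step.
    command = combine_actions({"move": 1, "aim": 2, "fire": 1})
    print(command)
    # Each agent would evaluate its own surrogate term with its own advantage.
    print(ppo_clip_term(new_prob=0.6, old_prob=0.4, advantage=1.0))
```

In a ViZDoom setting, a 0/1 vector of this kind matches the shape of a button-state list passed to the game at each step. Splitting the buttons across agents keeps each agent's choice set small while still allowing combined commands. How the paper actually partitions tasks and rewards among agents is not stated in the abstract.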
Keywords: deep reinforcement learning; convolution neural network; partially observable Markov decision process; multi-agent system
JEL-codes: C
Date: 2025
Downloads:
https://www.mdpi.com/2227-7390/13/5/754/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/5/754/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:5:p:754-:d:1599535
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.