Collaborative multi-agents in dynamic industrial internet of things using deep reinforcement learning

Raza, Ali; Shah, Munam Ali; Khattak, Hasan Ali; Maple, Carsten; Al-Turjman, Fadi; Rauf, Hafiz Tayyab

Ali Raza, Munam Ali Shah, Hasan Ali Khattak, Carsten Maple, Fadi Al-Turjman and Hafiz Tayyab Rauf
Additional contact information
Ali Raza: COMSATS University Islamabad
Munam Ali Shah: COMSATS University Islamabad
Hasan Ali Khattak: National University of Sciences and Technology (NUST)
Carsten Maple: University of Warwick
Fadi Al-Turjman: Near East University
Hafiz Tayyab Rauf: University of Bradford

Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, 2022, vol. 24, issue 7, No 22, 9499 pages

Abstract: Sustainable cities are envisioned to take economic and industrial steps toward reducing pollution. Many real-world applications, such as autonomous vehicles, transportation, traffic signals, and industrial automation, can now be trained using deep reinforcement learning (DRL) techniques. These applications are designed to take advantage of DRL to improve monitoring and measurement in the industrial internet of things for automation identification systems. The complexity of these environments means that multi-agent systems are more appropriate than a single agent. However, in non-stationary environments, multi-agent systems can suffer from an increased number of observations, which limits the scalability of algorithms. This study proposes a model to tackle the scalability problem of DRL algorithms in the transportation domain. The proposed model uses a partition-based approach that keeps agents within their working areas, which reduces both the complexity of the learning environment and the number of observations for each agent. The model combines generative adversarial imitation learning and behavior cloning with a proximal policy optimization (PPO) algorithm to train multiple agents in a dynamic environment. We compare PPO, soft actor-critic (SAC), and our model in reward gathering. Our simulation results show that our model outperforms SAC and PPO in cumulative reward gathering and dramatically improves the training of multiple agents.
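
Illustrative sketch (not taken from the article): the partition-based idea in the abstract can be pictured as a filter that lets each agent observe only the entities inside its own working area. The example below assumes a rectangular world split into a uniform grid; the `Entity` class, the function names, and the grid layout are all hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A hypothetical agent or object with a 2D position."""
    name: str
    x: float
    y: float


def partition_index(x, y, width, height, cols, rows):
    """Map a position in [0, width] x [0, height] to its (col, row) grid cell."""
    col = max(0, min(int(x / width * cols), cols - 1))
    row = max(0, min(int(y / height * rows), rows - 1))
    return col, row


def local_observation(agent, entities, width, height, cols, rows):
    """Return only the entities that share the agent's partition.

    Assumption: dropping everything outside the agent's cell is what
    shrinks the per-agent observation set.
    """
    home = partition_index(agent.x, agent.y, width, height, cols, rows)
    return [
        e for e in entities
        if e is not agent
        and partition_index(e.x, e.y, width, height, cols, rows) == home
    ]


if __name__ == "__main__":
    agent = Entity("agent", 1.0, 2.0)
    world = [agent, Entity("a", 1.0, 1.0), Entity("b", 2.0, 3.0), Entity("c", 8.0, 8.0)]
    # 10 x 10 world split into a 2 x 2 grid: "c" lies in a different cell.
    print([e.name for e in local_observation(agent, world, 10, 10, 2, 2)])
```

With a fixed number of entities per cell, the size of each agent's observation depends on the cell rather than on the whole environment, which is the scalability effect the abstract describes.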

Keywords: Deep reinforcement learning; Multi-agents; Behavior cloning; Dynamic environment; Scalability
Date: 2022
Downloads: (external link)
http://link.springer.com/10.1007/s10668-021-01836-9 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:endesu:v:24:y:2022:i:7:d:10.1007_s10668-021-01836-9

Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/10668

DOI: 10.1007/s10668-021-01836-9

Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development is currently edited by Luc Hens

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.