A Deep Reinforcement Learning Scheme for Spectrum Sensing and Resource Allocation in ITS

Wei, Huang; Peng, Yuyang; Yue, Ming; Long, Jiale; AL-Hazemi, Fawaz; Mirza, Mohammad Meraj

Huang Wei, Yuyang Peng, Ming Yue, Jiale Long, Fawaz AL-Hazemi and Mohammad Meraj Mirza
Additional contact information
Huang Wei: The School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China
Yuyang Peng: The School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China
Ming Yue: The School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China
Jiale Long: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529020, China
Fawaz AL-Hazemi: Department of Computer and Network Engineering, University of Jeddah, Jeddah 21959, Saudi Arabia
Mohammad Meraj Mirza: Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia

Mathematics, 2023, vol. 11, issue 16, 1-15

Abstract: In recent years, the Internet of Vehicles (IoV) has shown great potential for advancing intelligent transportation systems (ITSs) and smart cities. However, traditional IoV schemes have difficulty coping with an uncertain environment, which is the setting that spectrum resource allocation in IoV faces in most cases, whereas reinforcement learning is well suited to such uncertainty. This paper therefore investigates spectrum resource allocation by deep reinforcement learning, applied after spectrum sensing, in an ITS that includes vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) links. Spectrum resource allocation is modeled as a multi-agent reinforcement learning problem and solved with the soft actor critic (SAC) algorithm. Each V2V link is treated as an agent that interacts with the vehicular environment, and the agents together take a joint action. Each agent then receives its own observation and the same shared reward, and updates its networks using experiences drawn from memory. Over a given time period, each V2V link can thus optimize its spectrum allocation scheme to maximize the V2I capacity and increase the V2V payload delivery rate. However, the number of SAC networks grows linearly with the number of V2V links, so convergence may become difficult when there are too many V2V links. A new algorithm, parameter sharing soft actor critic (PSSAC), is therefore proposed to reduce complexity and make the model easier to converge. Simulation results show that both SAC and PSSAC improve the V2I capacity and increase the V2V payload transmission success probability within a given time. Specifically, the proposed schemes achieve a 10 percent performance improvement over the existing scheme in the vehicular environment, and PSSAC additionally has lower complexity.
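
The linear growth in network count mentioned in the abstract, and the saving from parameter sharing, can be illustrated with a small parameter count. This is only a sketch under stated assumptions: the abstract does not say how PSSAC shares parameters, so the code assumes one common variant in which all agents use a single network and a one-hot agent index is appended to each observation. The layer sizes, function names, and the restriction to one policy network per agent (SAC also uses critic networks) are hypothetical and not taken from the paper.

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def independent_params(num_agents, obs_dim, hidden, num_actions):
    """One separate policy network per V2V agent: grows linearly with num_agents."""
    return num_agents * mlp_param_count([obs_dim, *hidden, num_actions])


def shared_params(num_agents, obs_dim, hidden, num_actions):
    """One policy network for all agents (assumed sharing scheme).

    The input is widened by num_agents to hold a one-hot agent index.
    """
    return mlp_param_count([obs_dim + num_agents, *hidden, num_actions])


def shared_input(observation, agent_index, num_agents):
    """Append a one-hot agent index so the shared network can tell agents apart."""
    one_hot = [0.0] * num_agents
    one_hot[agent_index] = 1.0
    return list(observation) + one_hot


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    obs_dim, hidden, num_actions = 10, (64, 64), 8
    for n in (4, 8, 16):
        print(n,
              independent_params(n, obs_dim, hidden, num_actions),
              shared_params(n, obs_dim, hidden, num_actions))
```

With these illustrative sizes, the independent count doubles each time the number of V2V links doubles, while the shared count grows only through the wider input layer.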

Keywords: deep reinforcement learning; vehicle to vehicle; vehicle to infrastructure; spectrum resource allocation
JEL-codes: C
Date: 2023

Downloads:
https://www.mdpi.com/2227-7390/11/16/3437/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/16/3437/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:16:p:3437-:d:1212427

Mathematics is currently edited by Ms. Emma He