Policy Optimization of the Power Allocation Algorithm Based on the Actor–Critic Framework in Small Cell Networks

Chen, Haibo; Huang, Zhongwei; Zhao, Xiaorong; Liu, Xiao; Jiang, Youjun; Geng, Pinyong; Yang, Guang; Cao, Yewen; Wang, Deqiang

Additional contact information
Haibo Chen: School of Information Science and Engineering, Shandong University, Qingdao 266237, China
Zhongwei Huang: School of Information Science and Engineering, Shandong University, Qingdao 266237, China
Xiaorong Zhao: School of Information Science and Engineering, Shandong University, Qingdao 266237, China
Xiao Liu: School of Information Science and Engineering, Shandong University, Qingdao 266237, China
Youjun Jiang: School of Information Science and Engineering, Shandong University, Qingdao 266237, China
Pinyong Geng: School of Information Science and Engineering, Shandong University, Qingdao 266237, China
Guang Yang: College of Electronic and Information Engineering, Shandong University of Science and Technology, Qingdao 266590, China
Yewen Cao: School of Information Science and Engineering, Shandong University, Qingdao 266237, China
Deqiang Wang: School of Information Science and Engineering, Shandong University, Qingdao 266237, China

Mathematics, 2023, vol. 11, issue 7, 1-12

Abstract: A practical solution to the power allocation problem in ultra-dense small cell networks can be achieved by using deep reinforcement learning (DRL) methods. Unlike traditional algorithms, DRL methods are capable of achieving low latency and operating without the need for global real-time channel state information (CSI). Based on the actor–critic framework, we propose a policy optimization of the power allocation algorithm (POPA) for small cell networks in this paper. The POPA adopts the proximal policy optimization (PPO) algorithm to update the policy, which has been shown to have stable exploration and convergence effects in our simulations. Thanks to our proposed actor–critic architecture with distributed execution and centralized exploration training, the POPA can meet real-time requirements and has multi-dimensional scalability. Through simulations, we demonstrate that the POPA outperforms existing methods in terms of spectral efficiency. Our findings suggest that the POPA can be of practical value for power allocation in small cell networks.
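
Illustrative note (not part of the published abstract): the abstract says that POPA updates its policy with PPO. As a rough orientation, the standard PPO clipped surrogate objective can be sketched as below. This is the generic textbook form, not the authors' implementation. The function name, the clipping value 0.2, and the use of plain Python lists are assumptions made for illustration only.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to be maximized).

    new_logp / old_logp: log-probabilities of the sampled actions under the
    current and the data-collecting policy. advantages: advantage estimates,
    e.g. from a critic. All names here are hypothetical, and clip_eps=0.2
    is a commonly used default, not a value taken from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the new policy far from the old one within a single update, which is the usual explanation for PPO's stable training behaviour.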

Keywords: power allocation; deep reinforcement learning; actor–critic
JEL-codes: C
Date: 2023

Downloads:
https://www.mdpi.com/2227-7390/11/7/1702/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/7/1702/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:7:p:1702-:d:1114118

Mathematics is currently edited by Ms. Emma He