5G Multi-Slices Bi-Level Resource Allocation by Reinforcement Learning

Yu, Zhipeng; Gu, Fangqing; Liu, Hailin; Lai, Yutao

Additional contact information
Zhipeng Yu: School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou 510520, China
Fangqing Gu: School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou 510520, China
Hailin Liu: School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou 510520, China
Yutao Lai: School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou 510520, China

Mathematics, 2023, vol. 11, issue 3, 1-20

Abstract: With the separation of the centralized unit (CU) and the distributed unit (DU) in the fifth-generation mobile network (5G), multi-slice and multi-scenario services can be better supported in wireless communication. As 5G networks expand into vertical industries, their resource allocation takes on a clearly hierarchical structure. In this paper, we propose a bi-level resource allocation model. The upper-level objective is the profit the 5G operator earns when the base station allocates resources to slices. The lower-level objective is for each slice to allocate its resources fairly among its users. Resource allocation is a complex optimization problem with mixed discrete and continuous variables, so whether an algorithm can produce an allocation scheme quickly and accurately is the key to its practical application. Based on the characteristics of the problem, we use the multi-agent twin delayed deep deterministic policy gradient (MATD3) to solve the upper-level slice resource allocation and the discrete and continuous twin delayed deep deterministic policy gradient (DCTD3) to solve the lower-level user resource allocation. Accurately characterizing the environment, state, and reward is crucial when applying reinforcement learning to practical problems. We therefore provide an effective definition of the environment, state, action, and reward of MATD3 and DCTD3 for the bi-level resource allocation problem. We conduct simulation experiments and compare the proposed approach with the multi-agent deep deterministic policy gradient (MADDPG) algorithm and the nested bi-level evolutionary algorithm (NBLEA). The experimental results show that the proposed algorithm can quickly provide a better resource allocation scheme.
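
The abstract names MATD3 and DCTD3 but does not give their update rules. As a rough illustration only, the sketch below shows the two standard ingredients of the underlying twin delayed deep deterministic policy gradient (TD3) method: the clipped double-Q target and target policy smoothing. The function names, the scalar action in [0, 1], and all default values are assumptions made for this example; they are not taken from the paper.

```python
import random


def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two
    target-critic estimates to limit value overestimation.
    gamma=0.99 is an illustrative default, not a value from the paper."""
    return reward + gamma * (1.0 - done) * min(next_q1, next_q2)


def smoothed_target_action(action, noise_std=0.2, noise_clip=0.5,
                           low=0.0, high=1.0, rng=random):
    """Target policy smoothing: perturb the target action with clipped
    Gaussian noise, then clip the result to the action bounds.
    The bounds and noise settings are hypothetical (e.g. a normalized
    share of a resource), chosen only for illustration."""
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))
    return max(low, min(high, action + noise))


if __name__ == "__main__":
    rng = random.Random(0)
    a_next = smoothed_target_action(0.4, rng=rng)
    # The two critic values below are made-up numbers for the demo.
    y = td3_target(reward=1.0, done=0.0, next_q1=5.0, next_q2=4.0, gamma=0.9)
    print(a_next, y)
```

Taking the minimum of the two critics is what distinguishes the TD3 family from DDPG-style methods such as MADDPG, which bootstrap from a single critic.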

Keywords: bi-level optimization; multi-slice; resource allocation; reinforcement learning
JEL-codes: C
Date: 2023

Downloads:
https://www.mdpi.com/2227-7390/11/3/760/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/3/760/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:3:p:760-:d:1055456

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.