Deep Reinforcement Learning for 5G Radio Access Network Slicing with
Spectrum Coexistence
Abstract
This paper presents a reinforcement learning solution for dynamic
admission control and resource allocation of 5G radio access network
(RAN) slicing requests when the spectrum is potentially shared between
5G and an incumbent user, as in Citizens Broadband Radio Service (CBRS)
scenarios. Available communication resources (frequency-time resource
blocks and transmit powers) and computational resources (processor
power) not used by the incumbent user can be allocated to stochastic
arrivals of network slicing requests. Each request arrives with priority
(weight), throughput, computational resource, and latency (deadline)
requirements. As online algorithms, the greedy and myopic solutions that
do not consider heterogeneity of future requests and their arrival
process become ineffective for network slicing. Therefore, reinforcement
learning solutions (Q-learning and Deep Q-learning) are presented to
maximize the network utility in terms of the total weight of granted
network slicing requests over a time horizon, subject to communication
and computational constraints. Results show that reinforcement learning
provides improvements in the 5G network utility relative to myopic,
greedy, random, and first-come-first-served solutions. In particular,
deep Q-learning reduces the complexity and allows practical
implementation as the state-action space grows, and effectively
admits/rejects requests when 5G needs to share the spectrum with
incumbent users that may dynamically occupy some of the frequency-time
blocks. Furthermore, the robustness of deep reinforcement learning is
demonstrated in the presence of misdetection and false alarm errors in
detecting the incumbent user's activity.
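The admission-control idea summarized above can be illustrated with a minimal tabular Q-learning sketch. This is not the paper's implementation: the state (free resource blocks plus the arriving request's demand), the capacity, the request weights, and the periodic release of blocks are all simplified assumptions made for illustration, and the deep Q-learning variant would replace the table with a neural network over a much larger state-action space.

```python
import random

random.seed(0)

# Hypothetical, simplified setting: the state is (free resource blocks,
# demand of the arriving slice request); action 0 = reject, 1 = admit.
# Admitting a feasible request earns its priority weight as reward,
# mirroring the objective of maximizing total weight of granted requests.
CAPACITY = 10                 # total frequency-time blocks (illustrative)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

Q = {}                        # tabular Q-values: state -> [Q_reject, Q_admit]

def q(state):
    return Q.setdefault(state, [0.0, 0.0])

def step(free, demand, weight):
    """One epsilon-greedy Q-learning update for an admit/reject decision."""
    state = (free, demand)
    if random.random() < EPS:
        action = random.randrange(2)
    else:
        action = max((0, 1), key=lambda a: q(state)[a])
    if action == 1 and demand <= free:
        reward, free = weight, free - demand
    else:
        reward, action = 0.0, 0   # infeasible admission counts as a reject
    next_state = (free, random.randint(1, 4))
    q(state)[action] += ALPHA * (
        reward + GAMMA * max(q(next_state)) - q(state)[action]
    )
    return free, next_state[1]

# Train over stochastic arrivals; blocks are periodically released,
# loosely modeling slice lease expirations.
free, demand = CAPACITY, random.randint(1, 4)
for t in range(5000):
    weight = random.choice([1.0, 5.0])   # request priority (weight)
    free, demand = step(free, demand, weight)
    if t % 20 == 0:
        free = CAPACITY
```

Under this toy model, the learned Q-values favor rejecting low-weight requests when few blocks remain, which is the behavior the abstract attributes to the RL solutions relative to greedy and myopic baselines.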