Abstract
Efficient resource allocation is pivotal in vehicular
communications, where the number of connected devices is growing steeply
and requirements are increasingly stringent. In this work, we propose a deep
reinforcement learning (DRL)-based sequential scheduling approach for
sum-rate maximization in the uplink of intelligent reflecting surface
(IRS)-assisted multiuser (MU) multiple-input multiple-output (MIMO)
vehicular communications. We formulate the scheduling task as a
partially observable Markov decision process (POMDP) and develop a novel
stream-level sequential solution based on the proximal policy
optimization (PPO) algorithm. We consider a realistic imperfect channel
state information (ICSI) model and assess the proposal in several
communication setups comprising both spatially uncorrelated and
correlated links. Simulation results show that the proposed DRL-based
sequential scheduling approach is a robust alternative to more
computationally demanding benchmarks.