UAV Coverage Path Planning with Quantum-based Deep Deterministic Policy Gradient
  • Silvirianti Silvirianti,
  • Bhaskara Narottama,
  • Soo Young Shin
Silvirianti Silvirianti
Kumoh National Institute of Technology

Corresponding Author: [email protected]


Abstract

This study proposes quantum-based deep deterministic policy gradient (Q-DDPG) and quantum-based recurrent DDPG (Q-RDDPG) schemes for time-series optimization in UAV communications. Q-DDPG-based actor-critic reinforcement learning is used to optimize action selection over a large state space and a continuous action space. In these schemes, quantum models are exploited to reduce computational complexity and training loss. As a particular case, Q-DDPG and Q-RDDPG are applied to trajectory optimization and dynamic resource allocation in UAV communications. The quantum circuits of the Q-DDPG schemes are described to show how they can be implemented on noisy intermediate-scale quantum (NISQ) computers. The results demonstrate that the Q-DDPG and Q-RDDPG schemes achieve higher rewards with lower training losses than classical DDPG.
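
As a rough illustration of how a quantum model could stand in for a DDPG actor, the sketch below simulates a one-qubit parameterized circuit in NumPy. It is not the circuit from the paper: the single-qubit layout, the angle encoding of a scalar state feature, and the names quantum_actor and parameter_shift_grad are all assumptions made for this example. The circuit output, the expectation value of Pauli-Z, lies in [-1, 1] and can therefore serve as a bounded continuous action. Its gradient with respect to the trainable rotation angle is obtained with the standard parameter-shift rule.

```python
import numpy as np

def ry(theta):
    """Matrix of a single-qubit RY(theta) rotation."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

PAULI_Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def quantum_actor(state_feature, weight):
    """Hypothetical one-qubit actor (an assumption, not the paper's circuit).

    Returns <Z>, a deterministic continuous action in [-1, 1].
    """
    psi = np.array([1.0, 0.0])         # start in |0>
    psi = ry(state_feature) @ psi      # angle-encode the scalar state feature
    psi = ry(weight) @ psi             # trainable rotation
    return float(psi @ PAULI_Z @ psi)  # expectation value of Pauli-Z

def parameter_shift_grad(state_feature, weight):
    """Gradient of the action w.r.t. the weight via the parameter-shift rule."""
    plus = quantum_actor(state_feature, weight + np.pi / 2.0)
    minus = quantum_actor(state_feature, weight - np.pi / 2.0)
    return 0.5 * (plus - minus)

action = quantum_actor(0.3, 0.4)
grad = parameter_shift_grad(0.3, 0.4)
```

For this toy circuit the two rotations compose, so the action equals cos(state_feature + weight) and the gradient equals -sin(state_feature + weight). In a full DDPG loop such a gradient would be chained with the critic's gradient with respect to the action to update the actor. The critic and training loop are omitted here.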