Assessing the Performance of Reinforcement Learning on Passive RRAM
Crossbar Array
Abstract
Reinforcement learning is a promising approach that can allow machines
to acquire knowledge and solve problems without the intervention of
humans. However, current implementations of reinforcement learning
algorithms on standard complementary metal-oxide-semiconductor (CMOS)
platforms are constrained by the von Neumann architecture, which
increases energy consumption and latency. To this end, in
this work, we propose an extremely area- and energy-efficient
implementation of Monte Carlo learning on a passive resistive random
access memory (RRAM) crossbar array that accounts for non-ideal hardware
artifacts such as device-to-device variation, noise, and endurance
failure. To illustrate the capabilities of our implementation, we
consider the classical cart-pole control problem. Our results
indicate that the proposed passive RRAM crossbar-based implementation of
Monte Carlo learning not only outperforms prior digital and active
one-transistor one-RRAM (1T1R) crossbar-based implementations by more than
five orders of magnitude in terms of area but is also robust against
spatial and temporal variations and endurance failure of RRAM devices.