CityLearn: Diverse Real-World Environments for Sample-Efficient
Navigation Policy Learning
Abstract
Visual navigation tasks in real-world environments often require both
self-motion and place recognition feedback. While deep reinforcement
learning has shown success in solving these perception and
decision-making problems in an end-to-end manner, these algorithms
require large amounts of experience to learn navigation policies from
high-dimensional data, which is generally impractical for real robots
due to sample complexity. In this paper, we address these problems with
two main contributions. First, we leverage place recognition and deep
learning techniques combined with goal destination feedback to generate
compact, bimodal image representations that can then be used to
effectively learn control policies from a small amount of experience.
Second, we present CityLearn, an interactive framework that, for the first
time, enables training and deployment of navigation algorithms across
city-sized, realistic environments with extreme visual appearance
changes. CityLearn features more than 10 benchmark datasets, often used
in visual place recognition and autonomous driving research, including
over 100 recorded traversals across 60 cities around the world. We
evaluate our approach on two CityLearn environments, training our
navigation policy on a single traversal. Results show that our method can be
over two orders of magnitude faster than when using raw images, and can
also generalize across extreme visual changes including day to night and
summer to winter transitions.