We Learn Better Road Pothole Detection: from Attention Aggregation to
Adversarial Domain Adaptation
Abstract
Manual visual inspection, typically performed by certified inspectors,
is still the main form of road pothole detection. This process is,
however, not only tedious, time-consuming and costly, but also dangerous
for the inspectors. Furthermore, the road pothole detection results are
inherently subjective, because they depend entirely on the inspector’s
experience. In this paper, we first introduce a disparity (or inverse
depth) image processing module, named quasi-inverse perspective
transformation (QIPT), which makes damaged road areas highly
distinguishable. Then, we propose a novel attention aggregation
(AA) framework, which improves semantic segmentation networks for
better road pothole detection by taking advantage of different
types of attention modules. Moreover, we develop a novel training set
augmentation technique based on adversarial domain adaptation, where
synthetic road RGB images and transformed road disparity (or inverse
depth) images are generated to enhance the training of semantic
segmentation networks.
The experimental results demonstrate that, firstly, the disparity (or
inverse depth) images transformed by our QIPT module become more
informative; secondly, adversarial domain adaptation not only
significantly improves the performance of state-of-the-art semantic
segmentation networks, but also accelerates their convergence. In
addition, our best-performing implementations, AA-UNet and AA-RTFNet,
outperform all other state-of-the-art single-modal and data-fusion
networks, respectively, for road pothole detection.