TechRxiv

LiCaNet: Further Enhancement of Joint Perception and Motion Prediction based on Multi-Modal Fusion

preprint
posted on 08.09.2021, 13:11 by Yasser Khalil, Hussein T. Mouftah

The safety and reliability of autonomous driving hinge on the accuracy of the perception and motion prediction pipelines, which in turn depend primarily on the sensors deployed onboard. Even minor errors in perception and motion prediction can have catastrophic consequences due to misinterpretation in downstream pipelines. Researchers have therefore devoted considerable recent effort to developing accurate perception and motion prediction models. To that end, we propose the LIDAR Camera network (LiCaNet), which leverages multi-modal fusion to further enhance the joint perception and motion prediction performance achieved in our earlier work. LiCaNet expands on our previous fusion network by adding a camera image to the fusion of the range-view (RV) image with historical bird's-eye-view (BEV) data sourced from a LIDAR sensor. We present a comprehensive evaluation that validates the outstanding performance of LiCaNet against the state-of-the-art. Experiments reveal that utilizing a camera sensor yields a substantial perception gain over our previous fusion network and a steep reduction in displacement errors. Moreover, the majority of the improvement falls within camera range, with the largest gains registered for small and distant objects, confirming the significance of incorporating a camera sensor into a fusion network.
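The fusion the abstract describes (a camera image added to the fusion of a LIDAR range-view image with historical BEV frames) can be sketched at a high level. The sketch below is a minimal illustration only: the array shapes, the assumption that each modality is already encoded and resampled to a common BEV grid, and the channel-wise concatenation step are all assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

# Hypothetical pre-encoded feature maps, all assumed already resampled
# to a common 128x128 BEV grid (shapes are illustrative, not from the paper).
H, W = 128, 128
bev_hist = np.random.rand(5, H, W)   # 5 historical LIDAR BEV frames
rv_feat = np.random.rand(8, H, W)    # LIDAR range-view features, warped to BEV
cam_feat = np.random.rand(8, H, W)   # camera features, projected to BEV

def fuse(*feature_maps):
    """Channel-wise concatenation: the simplest form of multi-modal fusion.

    A downstream network head would consume the fused tensor to produce
    joint perception and motion prediction outputs.
    """
    return np.concatenate(feature_maps, axis=0)

fused = fuse(bev_hist, rv_feat, cam_feat)
print(fused.shape)  # (21, 128, 128)
```

In practice, fusion networks typically use learned projection and attention or convolutional mixing rather than plain concatenation; the sketch only shows where the camera stream enters relative to the two LIDAR-derived streams.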

History

Email Address of Submitting Author

ykhal038@uottawa.ca

ORCID of Submitting Author

0000-0002-6632-6068

Submitting Author's Institution

University of Ottawa

Submitting Author's Country

Canada