Deep Learning for Omnidirectional Vision: A Survey and New Perspectives
In this paper, we provide a comprehensive and systematic review and analysis of recent progress in DL methods for omnidirectional vision. Several earlier surveys exist in the literature. However, some of them focus on specific vision tasks, notably room layout reconstruction and 3D scene geometry recovery, while others provide only limited reviews of omnidirectional video streaming methods. Unlike existing surveys, this paper highlights the importance of deep learning and probes the recent advances in omnidirectional vision, both methodically and comprehensively. To the best of our knowledge, this is the first survey to comprehensively review and analyze DL methods for omnidirectional vision.
The main contributions of this paper to the community are fivefold:
(I) We comprehensively review and analyze the DL methods for omnidirectional vision, including the omnidirectional imaging principle, representation learning, datasets, a taxonomy, and applications, to highlight how omnidirectional data differs from 2D planar image data and the difficulties those differences introduce.
(II) We conduct an analytical study of recent trends in DL for omnidirectional vision, both hierarchically and structurally. Moreover, we discuss each category and offer insights into its challenges.
(III) We summarize the latest learning strategies and potential applications for omnidirectional vision.
(IV) We provide insightful discussions of the challenges and open problems yet to be solved, and propose potential future directions to spur more in-depth research by the community.
(V) We create an open-source repository that provides a taxonomy of all the mentioned works together with code links, in the hope that it can shed light on future research.
This paper is organized as follows. In Sec. 2, we introduce the imaging principle of ODI, convolution methods for omnidirectional vision, and some representative datasets. In Sec. 3, we review and analyze the existing DL approaches for various tasks and provide taxonomies to categorize the relevant papers. Sec. 4 covers novel learning paradigms for omnidirectional vision tasks, e.g., unsupervised learning, transfer learning, and reinforcement learning. Sec. 5 then examines the applications, followed by Sec. 6, where we discuss open problems and future directions. Finally, we conclude this paper in Sec. 7. Throughout, we summarize the representative works, if not all of them (over 200 papers), published in the last five years in top-tier conferences and journals in computer vision/graphics and machine learning.