Survey on Multimodal Transformers for Robots
- Kazuki Miyazawa
- Takayuki Nagai
Abstract
In recent years, transformers have attracted considerable
attention in various natural language processing tasks. More recently, they
have been used not only for natural language processing but also for
processing multimodal data such as images, video, and audio, and their
effectiveness has been demonstrated. The processing of multimodal data
is extremely important in robot intelligence. Therefore, multimodal
transformers have the potential to contribute to the development of
robotics in various domains. In this paper, we review the application of
transformers to robots and discuss their potential to
solve problems in current intelligent robotics.