Image Captioning for the Visually Impaired and Blind: A Recipe for
Low-Resource Languages
- Batyr Arystanbekov,
- Askat Kuzdeuov,
- Shakhizat Nurgaliyev,
- Hüseyin Atakan Varol
Askat Kuzdeuov
Institute of Smart Systems and Artificial Intelligence
Corresponding Author: [email protected]
Abstract
Visually impaired and blind people often face a range of socioeconomic
problems that can make it difficult for them to live independently and
participate fully in society. Advances in machine learning open new
avenues for implementing assistive devices for the visually impaired and
blind. In this work, we combined image captioning and text-to-speech
technologies to create an assistive device for the visually impaired and
blind. Our system can provide the user with descriptive auditory
feedback in the Kazakh language on a scene acquired in real time by a
head-mounted camera. The image captioning model for the Kazakh language
achieved satisfactory results in both quantitative metrics and
subjective evaluation. Finally, experiments with a visually unimpaired
blindfolded participant demonstrated the feasibility of our approach.