Deep Active Learning Approach for Traffic Sign and Panel Guide Arabic-Latin Text Content Annotation in Natural Scene Images
The detection and recognition of road traffic signs and panel guide content remain challenging problems. Few studies address both tasks at the same time, especially for the Arabic language. In addition, the limited number of datasets for traffic sign and panel guide content makes the investigation more interesting. In this work, we propose a Deep Active Learning Approach for Traffic Sign and Panel Guide Arabic-Latin Text Content Annotation in Natural Scene Images. The proposed system combines a convolutional neural network (CNN) with active learning to detect and recognize traffic signs and multilingual scene text from panel guides, particularly text with Arabic and Latin characters. The annotation system, based on the active learning method, is applied to the Natural Scene Traffic Sign and Panel Guide Arabic-Latin Text dataset (NaSTSArLaT), collected on Tunisian highways. The dataset initially contains 3000 collected images, of which only 181 samples are annotated. Progressively, unannotated training images are automatically annotated when the YOLOv5 detection model predicts them with high confidence, and only the less confident examples are manually annotated. The annotation output consists of a class label and a bounding box for every traffic sign and panel guide text content. The performance of the proposed system is evaluated on test data that represents 10% of the available annotated set. The proposed active learning strategy achieves 62.7% mean Average Precision (mAP) with manual annotation of only about one quarter of the samples, whereas training on all the data after a complete annotation yields 69.2%, which is comparable to our approach.
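The confidence-based routing described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Detection` type, the function names, the threshold value of 0.8, and the rule of scoring an image by its least confident detection are all assumptions made for illustration, and the detector itself is left out, so the predictions are supplied as plain data rather than produced by YOLOv5.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    """One predicted object: class label, bounding box, and detector confidence."""
    label: str
    box: Tuple[float, float, float, float]  # hypothetical format: x_center, y_center, width, height
    confidence: float


def image_confidence(detections: List[Detection]) -> float:
    """Score an image by its least confident detection (an assumed aggregation rule).

    An image with no detections gets 0.0 so that it is always reviewed manually.
    """
    if not detections:
        return 0.0
    return min(d.confidence for d in detections)


def split_for_annotation(
    predictions: Dict[str, List[Detection]],
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> Tuple[Dict[str, List[Detection]], List[str]]:
    """Route each unannotated image to automatic or manual annotation.

    Images whose score reaches the threshold keep the model's predictions as
    labels; all other images are returned as a list for manual annotation.
    """
    auto_annotated: Dict[str, List[Detection]] = {}
    manual_queue: List[str] = []
    for image_id, detections in predictions.items():
        if image_confidence(detections) >= threshold:
            auto_annotated[image_id] = detections
        else:
            manual_queue.append(image_id)
    return auto_annotated, manual_queue
```

In an active learning loop of this kind, the automatically annotated images and the manually corrected ones would both be added to the training set before the detector is retrained and the remaining unannotated images are scored again. Other aggregation rules, such as the mean confidence per image, would fit the same structure.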