Small Text Extraction from Documents and Chart Images
  • Rominkumar Busa,
  • shahira k. c.,
  • Lijiya A.
Rominkumar Busa
National Institute of Technology Calicut

Corresponding Author:[email protected]


Abstract

Text recognition is an important area of computer vision that deals with detecting and recognizing text in images. Optical Character Recognition (OCR) is now a mature area with very good recognition accuracy. However, when the same OCR methods are applied to text with a small font size, such as the text in chart images, the recognition rate falls below 30%. In this work, our aim is to extract small text from images using a deep learning model, a CRNN (convolutional recurrent neural network) trained with CTC (connectionist temporal classification) loss. Recognition accuracy improves when the image is enhanced by super resolution before it is passed to the CRNN model. We observed that the text recognition rate increases by 24% with the proposed method, which involves super resolution and character segmentation using projection profiles, followed by a CRNN with CTC loss. The effectiveness of the proposed method suggests that further pre-processing of chart image text and other small text images will improve accuracy further, thereby helping text extraction from chart images.
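
The abstract names projection-profile character segmentation as one step of the pipeline but does not spell it out. The following is a minimal sketch of the general idea, not the authors' implementation: it assumes a binarized text-line image stored as a list of rows with 1 for ink and 0 for background, and the function names and the `min_ink` threshold are hypothetical.

```python
from typing import List, Tuple


def vertical_projection(binary: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each column of a binarized text-line image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def segment_columns(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column spans, end exclusive, whose ink count reaches min_ink.

    Columns below min_ink are treated as gaps between characters (an assumption;
    touching characters would need a more careful rule).
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = x                      # a character region begins
        elif count < min_ink and start is not None:
            spans.append((start, x))       # a gap closes the region
            start = None
    if start is not None:
        spans.append((start, len(profile)))  # region runs to the right edge
    return spans


# Toy 3x8 image containing three ink regions separated by empty columns.
image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
profile = vertical_projection(image)
spans = segment_columns(profile)
print(profile)  # [3, 2, 0, 0, 3, 0, 2, 3]
print(spans)    # [(0, 2), (4, 5), (6, 8)]
```

Each span could then be cropped and passed on to the recognizer. In practice the image would first be upscaled and binarized, and the gap threshold would need tuning for small fonts, where neighbouring characters often touch.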