
Document Analysis and Recognition: A survey
  • Shekhar Verma,
  • P Nagabhushan
Indian Institute of Information Technology

Corresponding Author: [email protected]

Research on Document Analysis and Recognition (DAR) began with the problem of automatic character recognition. Today it spans a wide range of recognition tasks, including text recognition, script identification, writer identification, word spotting, and signature verification, in scripts such as Roman, Arabic, Chinese, Japanese, and Brahmi. Advances in deep learning have produced state-of-the-art results on many of these tasks. This work explores the challenges of DAR from different perspectives and reviews techniques for online and offline, and printed and handwritten, DAR tasks. We examine prior research from the perspective of script-related challenges, and we categorize DAR datasets as historical, printed, or handwritten. We present a comprehensive survey of challenges, techniques, datasets, evaluation metrics, script-related perspectives, and potential future directions for DAR.