TechRxiv

Document Analysis and Recognition: A survey

preprint
posted on 2023-03-27, 05:57 authored by SHIVANGI NIGAM, Shekhar Verma, P Nagabhushan

Research in Document Analysis and Recognition (DAR) began with the problem of automatic character recognition. Today, it spans a wide range of recognition tasks, such as text recognition, script identification, writer identification, word spotting, and signature verification, in scripts such as Roman, Arabic, Chinese, Japanese, and Brahmi. Advances in deep learning have achieved state-of-the-art results for various DAR tasks. This work explores the challenges from different perspectives and reviews the techniques for online/offline and printed/handwritten DAR tasks. We examine the research works from the perspective of script-related challenges. Datasets for DAR are categorized into historic, printed, and handwritten datasets. We present a comprehensive survey of challenges, techniques, datasets, evaluation metrics, script-related perspectives, and potential future directions for DAR.

Email Address of Submitting Author

rsi2018506@iiita.ac.in

ORCID of Submitting Author

0000-0001-5393-1202

Submitting Author's Institution

Indian Institute of Information Technology, Allahabad

Submitting Author's Country

  • India