Document Analysis and Recognition: A survey
Research in Document Analysis and Recognition (DAR) began with the problem of automatic character recognition. Today it spans a wide range of recognition tasks, such as text recognition, script identification, writer identification, word spotting, and signature verification, in scripts including Roman, Arabic, Chinese, Japanese, and Brahmi. Advances in deep learning have produced state-of-the-art results on many DAR tasks. This work explores the challenges from different perspectives and reviews the techniques for online/offline and printed/handwritten DAR tasks. We examine the research literature with a view to script-related challenges, and categorize DAR datasets into historical, printed, and handwritten collections. We present a comprehensive survey of challenges, techniques, datasets, evaluation metrics, script-related perspectives, and potential future directions for DAR.
Email Address of Submitting Author: rsi2018506@iiita.ac.in
ORCID of Submitting Author: 0000-0001-5393-1202
Submitting Author's Institution: Indian Institute of Information Technology, Allahabad
Submitting Author's Country: India