Training vision transformer with gradient centralization optimizer for Alzheimer's disease small dataset increase the diagnostic accuracy
  • Uttam Khatri,
  • Goo-Rak Kwon
Uttam Khatri
Dept. of Information and Communication Engineering, Chosun University

Corresponding Author: [email protected]

Goo-Rak Kwon
Dept. of Information and Communication Engineering, Chosun University


Alzheimer's disease (AD) is a neurodegenerative disorder that is increasingly common in the aging population. Early identification and care are the best ways to prevent AD. Vision transformers (ViT) have recently shown classification results superior to those of convolutional neural networks (CNNs) in several diagnostic imaging tasks and across multiple groups of medical images. Because the brain is a complicated system with interconnected parts, ViT, which tracks direct associations between image regions, may be more useful for brain image analysis than a CNN. A standard ViT cannot attend to the target class efficiently because of a large constant temperature factor in its attention and its weak locality inductive bias. This work proposes Shifted Patch Tokenization (SPT) and CoordConv Position Encoding (CPE) to compensate for ViT's lack of locality inductive bias. Moreover, we propose a gradient centralization technique with the Adam optimizer for better and faster training. We show qualitatively how each strategy contributes and helps to identify Alzheimer's disease. In our experiments, the proposed approach distinguished AD from healthy controls (HC) with a classification accuracy of 92.30%, a sensitivity of 95.31%, and a specificity of 91.45%, making it state of the art in terms of diagnostic accuracy. These findings demonstrate the efficacy and clinical relevance of the proposed approach for identifying AD.
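The abstract does not give implementation details for the optimizer, so the following is only a minimal NumPy sketch of how gradient centralization is commonly combined with an Adam update. It assumes weight tensors store output units on axis 0, and it uses the usual Adam default hyperparameters rather than values from this work. The function names `centralize_gradient` and `adam_gc_step` are hypothetical and chosen for illustration.

```python
import numpy as np


def centralize_gradient(grad):
    """Subtract the per-output-unit mean from a weight gradient.

    Assumption: axis 0 indexes output units. Tensors with fewer than
    two dimensions (for example biases) are returned unchanged.
    """
    if grad.ndim < 2:
        return grad
    axes = tuple(range(1, grad.ndim))
    return grad - grad.mean(axis=axes, keepdims=True)


def adam_gc_step(param, grad, m, v, t,
                 lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update applied to the centralized gradient.

    `t` is the 1-based step count; `m` and `v` are the running first
    and second moment estimates. Returns the updated (param, m, v).
    """
    g = centralize_gradient(grad)
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    m_hat = m / (1.0 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)  # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


if __name__ == "__main__":
    # Toy 2x3 weight matrix and gradient, for illustration only.
    W = np.zeros((2, 3))
    grad = np.array([[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]])
    m, v = np.zeros_like(W), np.zeros_like(W)
    W, m, v = adam_gc_step(W, grad, m, v, t=1)
    print(centralize_gradient(grad).sum(axis=1))  # each row sums to zero
```

Centralizing before the moment estimates are updated means both Adam moments are computed from zero-mean gradients, which is the usual way the two techniques are combined. Whether the authors apply it to every layer is not stated here.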
11 Jan 2024: Submitted to TechRxiv
22 Jan 2024: Published in TechRxiv