Quantized Magnetic Domain Wall Synapse for Efficient Deep Neural Networks
  • Seema Dhull,
  • Walid Misba,
  • Arshid Nisar,
  • Jayasimha Atulasimha,
  • Brajesh Kumar Kaushik
Seema Dhull
Indian Institute of Technology Roorkee

Corresponding Author: [email protected]


Quantizing synaptic weights with emerging non-volatile memory devices is a promising way to implement computationally efficient neural networks on resource-constrained hardware. However, practical implementation of such synaptic weights is hampered by imperfect memory characteristics, specifically the limited number of available quantized states and the large intrinsic device variation and stochasticity involved in writing the synaptic state. This article presents on-chip training and inference of a neural network using a quantized magnetic domain wall (DW) based synaptic array and CMOS peripheral circuits. Quantizing the synaptic device offers balanced neural network performance in terms of accuracy, power, and area. A rigorous model of the magnetic DW device that accounts for stochasticity and process variations is used for the synapse. To obtain stable quantized weights, the DW is pinned by means of physical constrictions. Finally, a VGG8 architecture for CIFAR-10 image classification is simulated using the extracted synaptic device characteristics. Performance in terms of accuracy, energy, latency, and area is evaluated while considering process variations and non-idealities in the DW device as well as the peripheral circuits. The proposed quantized neural network architecture achieves efficient on-chip learning with 92.4% training accuracy and 90.4% inference accuracy. Compared with a pure CMOS-based design, it demonstrates an overall improvement in area, energy, and latency of 13.8×, 9.6×, and 3.5×, respectively.
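
The idea of a synapse with only a few stable states and a stochastic write can be illustrated with a minimal Python sketch. It is not the authors' device model: the number of states (`n_states=5`), the weight range, the Gaussian write noise (`sigma`), and the function names are all assumptions chosen for illustration. The sketch snaps a real-valued weight to the nearest of a few evenly spaced levels, then models a noisy write as a perturbed target that settles at the nearest level, loosely mimicking a domain wall that comes to rest at a pinning site.

```python
import random

def quantize_weight(w, n_states=5, w_min=-1.0, w_max=1.0):
    """Snap a real-valued weight to the nearest of n_states evenly spaced levels.

    n_states, w_min and w_max are illustrative assumptions, not values
    taken from the abstract.
    """
    w = min(max(w, w_min), w_max)             # clamp to the representable range
    step = (w_max - w_min) / (n_states - 1)   # spacing between adjacent levels
    index = round((w - w_min) / step)         # index of the nearest level
    return w_min + index * step

def noisy_write(w, sigma=0.1, n_states=5, w_min=-1.0, w_max=1.0, rng=random):
    """Model a stochastic write (assumed Gaussian, purely illustrative).

    The intended level is perturbed by Gaussian noise and then snapped
    to the nearest level again, so the stored value is always one of
    the discrete states but may differ from the intended one.
    """
    target = quantize_weight(w, n_states, w_min, w_max)
    perturbed = target + rng.gauss(0.0, sigma)
    return quantize_weight(perturbed, n_states, w_min, w_max)

if __name__ == "__main__":
    rng = random.Random(0)                    # seeded for repeatability
    for w in [0.3, -0.2, 0.9]:
        print(w, quantize_weight(w), noisy_write(w, sigma=0.3, rng=rng))
```

With `sigma=0.0` the write is deterministic and `noisy_write` reduces to `quantize_weight`; increasing `sigma` makes it more likely that a write settles one or more levels away from the target, which is the kind of non-ideality a training loop would need to tolerate.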