
AdAM: Adaptive Approximate Multiplier for Fault Tolerance in DNN Accelerators
Mahdi Taheri
Tallinn University of Technology

Corresponding Author: [email protected]

Natalia Cherezova
Tallinn University of Technology
Samira Nazari
University of Zanjan
Ali Azarpeyvand
University of Zanjan, Tallinn University of Technology
Tara Ghasempouri
Tallinn University of Technology
Masoud Daneshtalab
Mälardalen University, Tallinn University of Technology
Jaan Raik
Tallinn University of Technology
Maksim Jenihhin
Tallinn University of Technology

Abstract

Deep Neural Network (DNN) hardware accelerators are essential in a range of safety-critical edge-AI applications with stringent reliability, energy efficiency, and latency requirements. Multiplication is the most resource-hungry operation in the neural network's processing elements. This paper proposes, at the algorithm and circuit levels, a scalable adaptive fault-tolerant approximate multiplier (AdAM) tailored for ASIC-based DNN accelerators. AdAM employs an adaptive adder that relies on an unconventional use of input Leading One Detector (LOD) values for fault detection by optimizing unutilized adder resources. A gate-level optimized LOD design and a hybrid adder design are also proposed as part of the adaptive multiplier to improve hardware performance. The proposed architecture uses a lightweight fault mitigation technique that sets the detected faulty bits to zero. Hardware resource utilization and the DNN accelerator's reliability metrics are used to compare the proposed solution against Triple Modular Redundancy (TMR) in multiplication, unprotected exact multiplication, and unprotected approximate multiplication. It is demonstrated that the proposed architecture enables multiplication with a reliability level close to that of TMR-protected multipliers while utilizing 2.74× less area and achieving a 39.06% lower power-delay product than the exact multiplier. Moreover, its area, delay, and power consumption are similar to those of state-of-the-art approximate multipliers with similar accuracy, while it additionally provides fault detection and mitigation.
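As a rough software illustration of three building blocks named in the abstract, the Python sketch below models a leading-one detector, the zero-out mitigation step, and a TMR majority vote. It is not the authors' circuit. The function names and the `faulty_mask` input are hypothetical: the abstract does not detail how AdAM derives its fault flags from the LOD values, so the mask is simply taken as given here.

```python
def leading_one_position(x: int) -> int:
    """Software stand-in for a Leading One Detector (LOD).

    Returns the index of the most significant set bit, or -1 when x is 0.
    """
    return x.bit_length() - 1


def zero_faulty_bits(value: int, faulty_mask: int) -> int:
    """Mitigation as the abstract describes it: force flagged bits to zero.

    faulty_mask is a hypothetical input with a 1 at every bit position
    that some detector has flagged as faulty.
    """
    return value & ~faulty_mask


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant results (the TMR baseline)."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    exact = 13 * 11                      # 143 = 0b10001111
    faulty = exact ^ (1 << 6)            # a fault flips bit 6 from 0 to 1 -> 207

    print(leading_one_position(13))              # 3
    print(zero_faulty_bits(faulty, 1 << 6))      # 143, the flipped bit was 0
    print(tmr_vote(exact, faulty, exact))        # 143, two good copies win
```

Zeroing only restores the exact value when the flagged bit was originally 0. If the fault hits a bit that should be 1, the result stays off by that bit's weight, which is the kind of bounded error an approximate multiplier is designed to tolerate. TMR corrects both cases but needs three multipliers and a voter.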
02 May 2024: Submitted to TechRxiv
06 May 2024: Published in TechRxiv