A Multiply-And-Max/min Neuron Paradigm for Aggressively Prunable Deep Neural Networks
  • Luciano Prono,
  • Philippe Bich,
  • Chiara Boretti,
  • Mauro Mangia,
  • Fabio Pareschi,
  • Riccardo Rovatti,
  • Gianluca Setti
Corresponding author: Luciano Prono, Politecnico di Torino ([email protected])

Abstract

The growing interest in Internet of Things and mobile Artificial Intelligence applications is driving the investigation of Deep Neural Networks (DNNs) that can operate at the edge on low-resource, low-energy devices.
To achieve this goal, several pruning techniques have been proposed in the literature. They aim to reduce the number of interconnections, and consequently the size and the corresponding computing and storage requirements, of DNNs that traditionally rely on classic Multiply-and-Accumulate (MAC) neurons.
In this work, we propose a novel neuron structure based on a Multiply-And-Max/min (MAM) map-reduce paradigm and show that this paradigm makes it possible to build naturally and aggressively prunable DNN layers with a negligible loss in performance. This novel structure allows a much greater interconnection sparsity than classic MAC-based DNN layers. Moreover, most existing state-of-the-art pruning techniques can be used with MAM layers with little to no modification.
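The abstract does not spell out the MAM neuron equation; the sketch below is a minimal PyTorch illustration under the assumption that each MAM output replaces the MAC sum of products with the sum of the largest and smallest product over the inputs. The class name MAMLayer and the weight initialization are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn

class MAMLayer(nn.Module):
    """Fully connected layer with Multiply-And-Max/min (MAM) aggregation.

    Assumption: each output neuron computes
        y_i = max_j(w_ij * x_j) + min_j(w_ij * x_j) + b_i
    instead of the MAC neuron's y_i = sum_j(w_ij * x_j) + b_i.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # products[b, i, j] = w[i, j] * x[b, j]
        products = x.unsqueeze(1) * self.weight.unsqueeze(0)
        # Max/min reduction: only the two extreme products reach each output.
        return products.amax(dim=-1) + products.amin(dim=-1) + self.bias
```

Under this assumed equation, only the argmax and argmin interconnections of each neuron contribute to the output of any given forward pass, which is consistent with the claim that MAM layers tolerate far greater sparsity than MAC layers.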
To test the pruning performance of MAM, we employ several models (AlexNet, VGG-16, and the more recent ViT-B/16) and several computer vision datasets (CIFAR-10, CIFAR-100, and ImageNet-1K). Multiple pruning approaches are applied, ranging from one-shot methods to training-dependent and iterative techniques.
As a notable example, we test MAM on the ViT-B/16 model fine-tuned on ImageNet-1K and apply one-shot gradient-based pruning, removing interconnections until the model experiences a 3% decrease in accuracy. While MAC-based layers need at least 56.16% of their interconnections to remain, MAM-based layers achieve the same accuracy with only 0.04%.
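The exact pruning criterion is not given here; the following is a minimal sketch of one-shot gradient-based pruning using a SNIP-style saliency score (the function name and the |w · dL/dw| score are assumptions, not the paper's stated method), showing how a single keep ratio would be applied to a model's weight tensors.

```python
import torch

def one_shot_gradient_prune(model: torch.nn.Module,
                            loss: torch.Tensor,
                            keep_ratio: float) -> None:
    """Hypothetical one-shot pruning pass: score each weight by |w * dL/dw|
    and zero out all but the top `keep_ratio` fraction per weight tensor."""
    loss.backward()  # populate gradients used by the saliency scores
    for p in model.parameters():
        if p.grad is None or p.dim() < 2:
            continue  # skip biases and tensors without gradients
        saliency = (p.detach() * p.grad).abs()
        k = max(1, int(keep_ratio * p.numel()))
        # threshold = k-th largest score, i.e. (numel - k + 1)-th smallest
        threshold = saliency.flatten().kthvalue(p.numel() - k + 1).values
        with torch.no_grad():
            p.mul_((saliency >= threshold).to(p.dtype))
```

For instance, keep_ratio=0.5616 would mirror the MAC baseline quoted above, while keep_ratio=0.0004 corresponds to the 0.04% of interconnections reported for MAM layers.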
08 Feb 2024: Submitted to TechRxiv
13 Feb 2024: Published in TechRxiv