
A hardware-friendly low-bit post-training quantization algorithm for lightweight networks
  • Jixing Li ,
  • Zhiyuan Zhao ,
  • Gang Chen ,
  • Ming Jin ,
  • Huaxiang Lu
Jixing Li
Institute of Semiconductors, CAS
Corresponding Author: [email protected]

Abstract

Lightweight convolutional neural networks (LCNNs) are commonly quantized and deployed on edge devices to meet the requirements of low-power, high-performance tasks. Uniform, symmetric, per-tensor quantization with power-of-two scale factors improves hardware efficiency, albeit at the cost of significant accuracy degradation. In this work, we present a hardware-friendly post-training quantization (HFPTQ) framework to address this problem. HFPTQ employs a synergistic combination of established techniques, including Cross-Layer Equalization (CLE), Absorbing High Bias (AHB), and Adaptive Rounding. Furthermore, HFPTQ introduces quantization-aware equalization, which optimizes the equalization scale and absorption vector to minimize block reconstruction error and the difference in network predictions. HFPTQ also introduces an adaptive bias correction mechanism: while weight rounding is being optimized, the bias term is co-optimized to rectify the mean shift in activations induced by quantization. Experimental results on the ImageNet dataset demonstrate that HFPTQ significantly improves low-bit, hardware-friendly quantization performance for lightweight networks.
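To make the hardware constraint concrete, the sketch below illustrates uniform, symmetric, per-tensor quantization with a power-of-two scale factor, the quantizer form the abstract refers to. It is a minimal illustration under assumed conventions, not the authors' implementation; the function name `po2_quantize` and the rounding-up of the scale to the nearest power of two are our own choices for the example.

```python
import numpy as np

def po2_quantize(x, num_bits=8):
    """Minimal sketch: uniform, symmetric, per-tensor quantization with a
    power-of-two scale, so dequantization reduces to a bit shift in hardware.
    (Illustrative only; not the HFPTQ implementation.)"""
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 127 for signed 8-bit
    max_abs = np.max(np.abs(x)) + 1e-12         # per-tensor dynamic range
    # Constrain the scale to a power of two (round up to avoid clipping the max).
    scale = 2.0 ** np.ceil(np.log2(max_abs / qmax))
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale                              # dequantize as q * scale

# Example: quantize a dummy weight tensor to 4 bits and check the error.
w = np.random.randn(64, 32).astype(np.float32)
q, s = po2_quantize(w, num_bits=4)
print("scale:", s, "max abs error:", np.max(np.abs(w - q * s)))
```

With such a coarse, power-of-two-constrained per-tensor grid, channels with small dynamic range lose most of their resolution, which is the accuracy gap that CLE/AHB-style equalization, adaptive rounding, and bias correction in HFPTQ aim to close.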