Combining Learnable Low-dimensional Binary Filter Bases for Compressing
Convolutional Neural Networks
Abstract
Existing convolutional neural networks (CNNs) have achieved impressive
performance on various real-life tasks, but the large number of parameters
in their convolutional layers requires huge storage and computation
resources, which makes it difficult to deploy CNNs on memory-constrained embedded
devices. In this paper, we propose a novel compression method that
generates the convolution filters in each layer by combining a set of
learnable low-dimensional binary filter bases. The proposed method
designs more compact convolution filters by stacking the linear
combinations of these filter bases. Because the filter bases are binary,
the compact filters can be represented with fewer bits, so the
network can be highly compressed. Furthermore, we exploit the sparsity
of the combination coefficients through L1-ball projection during the linear
combination to avoid overfitting. In addition, we analyze the
compression performance of the proposed method in detail. Evaluations on
four benchmark datasets with VGG-16 and ResNet-18 architectures show that
the proposed method achieves a higher compression ratio with
comparable accuracy compared with existing state-of-the-art filter-decomposition
and network-quantization methods.
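To make the core idea concrete, the following is a minimal NumPy sketch of building one convolution filter as a linear combination of binary filter bases, with the combination coefficients projected onto an L1 ball (the standard sorting-based projection) to keep them sparse and bounded. All names, shapes, and the radius are illustrative assumptions, not details from the paper.

```python
import numpy as np


def project_l1_ball(v, z=1.0):
    """Project vector v onto the L1 ball {x : ||x||_1 <= z}.

    Standard sorting-based Euclidean projection; used here as one
    plausible realization of the coefficient projection step.
    """
    if np.abs(v).sum() <= z:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]              # magnitudes, descending
    cssv = np.cumsum(u)                       # cumulative sums
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > (cssv - z))[0][-1]
    theta = (cssv[rho] - z) / (rho + 1.0)     # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)


# Hypothetical sizes: k binary bases, each a flattened 3x3x3 filter.
rng = np.random.default_rng(0)
k, d = 8, 3 * 3 * 3
bases = rng.integers(0, 2, size=(k, d)) * 2 - 1          # binary bases in {-1, +1}
coeffs = project_l1_ball(rng.normal(size=k), z=1.0)      # projected coefficients
filt = coeffs @ bases                                    # one compact filter
```

Storage-wise, each basis costs one bit per weight, and only the few real-valued coefficients per filter remain in full precision, which is the source of the compression.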