Approximation-Aware and Quantization-Aware Training for Graph Neural Networks
Graph Neural Networks (GNNs) are among the best-performing models for processing graph data. They are known to have considerable computational complexity despite having fewer parameters than traditional Deep Neural Networks (DNNs). The operations-to-parameters ratio of GNNs can be tens to hundreds of times higher than that of DNNs, depending on the input graph size. This complexity underscores the importance of optimizing arithmetic operations within GNNs through model quantization and approximation. In this work, for the first time, we combine both approaches and implement quantization- and approximation-aware training for GNNs to sustain their accuracy under the errors induced by inexact multiplications. We employ a matrix-multiplication CUDA kernel to speed up the simulation of approximate multiplication within GNNs. Further, we compare the execution speed, accuracy, and energy efficiency of GNNs with approximate multipliers against those of quantized low-bit GNNs. We evaluate state-of-the-art GNN architectures (GIN, SAGE, GCN, and GAT) on various datasets and tasks (Reddit-Binary and Collab for graph classification; Cora and PubMed for node classification) with a wide range of approximate multipliers.
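To make the idea of simulating inexact multiplication concrete, the following is a minimal sketch in plain Python, not the CUDA kernel described above. It assumes signed 8-bit uniform quantization and a lookup table (LUT) holding the result of every operand pair, a common way to emulate an approximate multiplier in software. The names `quantize`, `approx_mul`, and `approx_matmul` are hypothetical, and the operand-truncation multiplier is an illustrative stand-in rather than any of the multipliers evaluated in this work.

```python
# Sketch: uniform quantization plus LUT-based approximate multiplication.
# Assumption: signed 8-bit operands; the truncation multiplier below is a
# hypothetical stand-in for a real approximate multiplier design.

BITS = 8
QMAX = 2 ** (BITS - 1) - 1  # 127 for signed 8-bit values


def quantize(values, scale):
    """Map floats to integers in [-QMAX, QMAX] using a uniform scale."""
    return [max(-QMAX, min(QMAX, round(v / scale))) for v in values]


def approx_mul(a, b, dropped_bits=2):
    """Inexact product: zero the low bits of each magnitude, then multiply."""
    sign = -1 if (a < 0) != (b < 0) else 1
    ta = (abs(a) >> dropped_bits) << dropped_bits
    tb = (abs(b) >> dropped_bits) << dropped_bits
    return sign * ta * tb


# Precompute every product once, so the matmul only performs table lookups.
LUT = [
    [approx_mul(a, b) for b in range(-QMAX, QMAX + 1)]
    for a in range(-QMAX, QMAX + 1)
]


def approx_matmul(x_q, w_q):
    """Integer matrix product: products come from the LUT, sums are exact."""
    rows, inner, cols = len(x_q), len(w_q), len(w_q[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc += LUT[x_q[i][k] + QMAX][w_q[k][j] + QMAX]
            out[i][j] = acc
    return out
```

Multiplying the integer result by the product of the activation and weight scales recovers an approximate floating-point output. During approximation-aware training, a forward pass of this kind would replace the exact product so that the weights can adapt to the multiplier's error; how gradients are propagated through the inexact product is not specified here and is left out of the sketch.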
Email Address of Submitting Author
novkinrn@iti.uni-stuttgart.de
ORCID of Submitting Author
0009-0006-6632-9804
Submitting Author's Institution
Stuttgart University
Submitting Author's Country
Germany