TechRxiv
Pipe_All_Short.pdf (198.78 kB)

Integrated ARM big.Little-Mali Pipeline for High-Throughput CNN Inference

preprint
posted on 20.07.2021, 16:09 by Ehsan Aghapour
State-of-the-art Heterogeneous Multi-Processor Systems-on-Chip (HMPSoCs) can perform on-chip embedded inference on their CPUs and GPUs. Multi-component pipelining is the method of choice for high-throughput Convolutional Neural Network (CNN) inference on embedded platforms. In this work, we detail Pipe-All, the first CPU-GPU pipeline design for CNN inference. Pipe-All uses the ARM-CL library to integrate an ARM big.Little CPU with an ARM Mali GPU. It is the first three-stage CNN inference pipeline design with ARM’s big CPU cluster, Little CPU cluster, and Mali GPU as its stages. Pipe-All provides, on average, a 75.88% improvement in inference throughput (over peak single-component inference) on the Amlogic A311D HMPSoC in the Khadas Vim 3 embedded platform. We also provide an open-source implementation of Pipe-All.
This paper has been submitted to IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) as a Transactions Brief paper (5 pages).
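
The three-stage pipelining idea in the abstract can be illustrated with a small sketch. This is not the Pipe-All implementation and does not use ARM-CL. It is a toy model that assumes each CNN layer has a known latency on each component, that every stage runs a contiguous block of layers, and that transfer costs between stages are negligible. The latency table, the function names, and the exhaustive search are all hypothetical and chosen only for illustration. In steady state a pipeline's throughput is set by its slowest stage, which is why a well-balanced split can exceed the best single component.

```python
from itertools import permutations

# Hypothetical per-layer latencies in milliseconds.
# These numbers are invented for illustration; they are NOT measurements from the paper.
LAYER_MS = {
    "gpu":    [4.0, 3.0, 3.0, 2.0, 2.0, 1.0],
    "big":    [6.0, 5.0, 4.0, 3.0, 3.0, 2.0],
    "little": [12.0, 10.0, 9.0, 6.0, 5.0, 3.0],
}


def single_component_throughput(latencies):
    """Frames per second when one component runs every layer back to back."""
    return 1000.0 / sum(latencies)


def best_three_stage_pipeline(layer_ms):
    """Search all stage orderings and contiguous layer splits (toy model).

    Returns (fps, stage_order, split_points). Steady-state throughput is
    limited by the slowest stage; inter-stage transfer cost is ignored,
    which is a simplifying assumption of this sketch.
    """
    names = list(layer_ms)
    n = len(layer_ms[names[0]])
    best = (0.0, None, None)
    for order in permutations(names):
        # i and j are the layer indices where stage 2 and stage 3 begin.
        for i in range(1, n - 1):
            for j in range(i + 1, n):
                bounds = [(0, i), (i, j), (j, n)]
                stage_ms = [
                    sum(layer_ms[comp][a:b])
                    for comp, (a, b) in zip(order, bounds)
                ]
                fps = 1000.0 / max(stage_ms)
                if fps > best[0]:
                    best = (fps, order, (i, j))
    return best


if __name__ == "__main__":
    peak_single = max(single_component_throughput(v) for v in LAYER_MS.values())
    fps, order, split = best_three_stage_pipeline(LAYER_MS)
    gain = 100.0 * (fps - peak_single) / peak_single
    print(f"peak single-component throughput: {peak_single:.2f} fps")
    print(f"best pipeline: stages={order}, split={split}, {fps:.2f} fps")
    print(f"improvement over peak single component: {gain:.2f}%")
```

The improvement printed by this toy model depends entirely on the invented latency table and should not be compared with the 75.88% figure reported in the abstract. A real design would also have to account for data transfer and synchronization between stages.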

Email Address of Submitting Author

aghapour.ehsan17@gmail.com

ORCID of Submitting Author

0000-0002-0291-7555

Submitting Author's Institution

University of Amsterdam

Submitting Author's Country

Netherlands