
Transport Triggered Array Processor for Vision Applications: Near-threshold Performance Loss Compensation Through Inherent Parallelism of Vision Array Processors
  • Mehdi Safarpour


Operating at reduced voltages promises substantial energy-efficiency improvements, but at the cost of a significant reduction in clock frequency. This paper proposes vision chips as an excellent fit for low-voltage operation. Low-level sensory data processing in many Internet-of-Things (IoT) devices pursues energy efficiency by utilizing sleep modes or slowing the clock to the minimum. To curb the share of stand-by power dissipation in those designs, near-threshold or sub-threshold operating points, or ultra-low-leakage fabrication processes, are employed. These limit clock rates significantly, reducing the computing throughput of individual processing cores. In this contribution we explore compensating for the performance loss of operating in the near-threshold region ($V_{dd} = 0.6$ V) through massive parallelization. The benefits of near-threshold operation and massive parallelism are optimum energy consumption per instruction and minimized memory round-trips, respectively. The Processing Elements (PEs) of the design are based on the Transport Triggered Architecture. The fine-grained programmable parallel solution allows fast and efficient computation of learnable low-level features (e.g. local binary descriptors and convolutions). Other operations, including max-pooling, have also been implemented. The programmable design achieves excellent energy efficiency for Local Binary Patterns computations.
Our results demonstrate that the inherent properties of the vision array processor and of vision applications allow voltage and clock frequency to be scaled down aggressively without compromising performance.
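
To make the parallelism argument concrete, the following is a minimal Python sketch of a basic 3x3 Local Binary Patterns computation. It is an illustration only, not the Transport Triggered Architecture implementation described above: the function names `lbp_code` and `lbp_image`, the clockwise neighbour ordering, and the `>=` comparison rule are assumptions made here. Each output code depends only on a pixel's 3x3 neighbourhood, so every pixel could in principle be handled by its own processing element.

```python
def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the interior pixel at (r, c).

    Assumed convention (not taken from the paper): neighbours are visited
    clockwise starting at the top-left, and a bit is set when the
    neighbour is greater than or equal to the centre pixel.
    """
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_image(img):
    """Compute LBP codes for all interior pixels of a 2-D list of ints.

    The per-pixel calls are independent of one another, which is the
    property a massively parallel array could exploit.
    """
    rows, cols = len(img), len(img[0])
    return [[lbp_code(img, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]


if __name__ == "__main__":
    patch = [[5, 9, 1],
             [4, 6, 7],
             [2, 3, 8]]
    print(lbp_image(patch))
```

In this sequential sketch the pixels are visited one after another; on an array of processing elements the same independent per-pixel work could be spread across the array, which is how a lower clock rate per element can be offset by parallelism.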