Abstract
Stochastic computing (SC) is attractive for hardware implementation
because its arithmetic units have low design complexity; SC has
therefore attracted considerable interest for implementing Artificial
Neural Networks (ANNs) in resource-limited applications, since ANNs
usually perform a large number of arithmetic operations. To attain a high
computation accuracy in an SC-based ANN, extended stochastic logic is
used together with standard SC units, and thus a stochastic divider
is required to convert between these two logic
representations. However, as the most complex SC arithmetic unit, the
conventional divider incurs a large computation latency; this limits
SC implementations of ANNs in applications that need high
performance. Therefore, there is a need to design fast stochastic
dividers for SC-based ANNs. Recent works (e.g., a binary searching and
triple modular redundancy (BS-TMR) based stochastic divider) target
a reduction in computation latency while keeping nearly the
same accuracy as the conventional design.
However, this divider still requires N iterations to deal with
2^N-bit stochastic sequences, and thus the
latency increases in proportion to the sequence length. In this paper, a
decimal searching and TMR (DS-TMR) based stochastic divider is first
proposed to further reduce the computation latency; it requires only two
iterations to calculate the quotient, regardless of the sequence
length. Moreover, a second design that trades off accuracy against
hardware cost is also presented. An SC-based Multi-Layer Perceptron (MLP) is
then considered to show the effectiveness of the proposed dividers;
results show that with the proposed dividers, the MLP achieves the
lowest computation latency while maintaining the same classification
accuracy. When the product of latency and power dissipation is used as
a combined metric, the proposed designs are also shown to be
superior to SC-based MLPs employing other dividers found in the
technical literature, as well as to the commonly used 32-bit
floating-point implementation. This makes the proposed dividers very
attractive compared with existing schemes for SC-based ANNs.
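
As background for the terms used above, the following is a minimal software sketch, not the hardware designs discussed in this paper. It assumes unipolar SC encoding (each bit is 1 with probability equal to the encoded value), where multiplication reduces to a bitwise AND, and it uses a generic binary search over the quotient to illustrate why a search-based divider with N-bit resolution needs N iterations. All function names (encode, decode, sc_multiply, binary_search_quotient) are hypothetical and chosen only for illustration; the TMR part of BS-TMR and the proposed DS-TMR scheme are not modeled.

```python
import random


def encode(p, length, rng):
    """Unipolar stochastic stream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def decode(bits):
    """Estimate the encoded value as the fraction of 1s in the stream."""
    return sum(bits) / len(bits)


def sc_multiply(x_bits, y_bits):
    """Unipolar SC multiplication: bitwise AND of two independent streams."""
    return [a & b for a, b in zip(x_bits, y_bits)]


def binary_search_quotient(a, b, n_bits):
    """Generic binary search for q in [0, 1) such that q * b <= a.

    Illustrative only (an assumption, not the BS-TMR circuit): one bit of
    the quotient is resolved per iteration, so n_bits iterations are needed
    for a resolution of 2**-n_bits.
    """
    q = 0.0
    iterations = 0
    for i in range(1, n_bits + 1):
        trial = q + 2.0 ** -i
        if trial * b <= a:
            q = trial
        iterations += 1
    return q, iterations


rng = random.Random(0)      # fixed seed so the run is repeatable
n = 10
length = 2 ** n             # 2^N-bit sequences, here 1024 bits

x = encode(0.5, length, rng)
y = encode(0.4, length, rng)
product = decode(sc_multiply(x, y))   # close to 0.5 * 0.4 = 0.2

q, iterations = binary_search_quotient(0.2, 0.8, n)  # close to 0.25
print(product, q, iterations)
```

The multiplication costs a single AND gate per bit, which is the low arithmetic complexity that motivates SC, whereas the search-based division loop runs once per bit of resolution; a divider with a fixed iteration count removes that dependence.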