TechRxiv

Neural Layer Bypassing Network

preprint
posted on 11.01.2022, 15:29 by Amogh Palasamudram

This research introduces and evaluates the Neural Layer Bypassing Network (NLBN), a new neural network architecture intended to improve the speed and effectiveness of forward propagation in deep learning. The architecture adds one (fully connected) layer after every layer in the main network. Each added layer determines whether the rest of the forward propagation is required to predict the output for the given input. To test the effectiveness of the NLBN, I implemented the architecture in 3 image classification models trained on 3 different datasets: the MNIST Handwritten Digits Dataset, the Horses or Humans Dataset, and the Colorectal Histology Dataset. After training 1 standard convolutional neural network (CNN) and 1 NLBN per dataset (both with equivalent architectures), I performed 5 trials per dataset to compare the performance of the two architectures. For the NLBN, I also collected data on the accuracy, time taken, and speed of the network with respect to the percentage of the model that inputs pass through. The architecture increased the speed of forward propagation by 6% to 25%, while accuracy tended to decrease by 0% to 4%; the results vary with the dataset and the structure of the model, but the increase in speed was normally at least twice the decrease in accuracy. Beyond its performance during prediction, the NLBN takes roughly 40% longer to train and requires more memory because of its added complexity. However, the architecture could be made more efficient if integrated into the TensorFlow libraries. Overall, by autonomously skipping neural network layers, this architecture could serve as a foundation for neural networks that teach themselves to become more efficient in applications requiring fast, accurate, and less computationally intensive predictions.
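The bypassing idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how each added layer decides whether to stop, so the sketch assumes each added layer produces class probabilities and that forward propagation stops once the highest probability reaches a confidence threshold. The names `forward_with_bypass`, `bypass_heads`, `random_layer`, and `threshold` are hypothetical, the main layers are plain dense layers with random weights rather than trained convolutional layers, and only the Python standard library is used.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """Fully connected layer: one row of weights per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def random_layer(n_in, n_out, rng):
    """Untrained layer with random weights (illustration only)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward_with_bypass(x, main_layers, bypass_heads, threshold=0.9):
    """Return (predicted_class, layers_used).

    Assumption: a bypass head stops forward propagation when its most
    probable class reaches `threshold`. The last head always answers.
    """
    if not main_layers or len(main_layers) != len(bypass_heads):
        raise ValueError("need one bypass head per main layer")
    h = x
    for depth, ((w, b), (hw, hb)) in enumerate(zip(main_layers, bypass_heads), 1):
        h = relu(dense(h, w, b))            # main network layer
        probs = softmax(dense(h, hw, hb))   # added fully connected layer
        confidence = max(probs)
        if confidence >= threshold or depth == len(main_layers):
            return probs.index(confidence), depth


# Hypothetical toy network: 4 inputs, three hidden layers of 8 units, 3 classes.
rng = random.Random(0)
sizes = [4, 8, 8, 8]
main_layers = [random_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
bypass_heads = [random_layer(b, 3, rng) for b in sizes[1:]]
x = [rng.uniform(-1, 1) for _ in range(4)]

label, used = forward_with_bypass(x, main_layers, bypass_heads, threshold=0.9)
print(f"predicted class {label} after {used} of {len(main_layers)} layers")
```

Under this assumed rule, a lower `threshold` lets more inputs exit early, which trades accuracy for speed in the same direction as the results reported above; the extra heads also illustrate why such a network would need more memory than the equivalent standard model.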

Email Address of Submitting Author

amogh.p.214@gmail.com

Submitting Author's Institution

R42 Institute

Submitting Author's Country

United States of America
