TechRxiv

Step size self-adaptation for SGD

preprint
posted on 06.04.2021, 14:45 by Ilona Kulikovskikh, Tarzan Legović

Convergence and generalization are two crucial aspects of performance in neural networks. When analyzed separately, these properties may lead to contradictory results: optimizing the convergence rate yields fast training but does not guarantee the best generalization error. To avoid this conflict, recent studies suggest adopting a moderately large step size for optimizers, but its added value for performance remains unclear. We propose the LIGHT function with four configurations that explicitly regulate improvements in convergence and in generalization on test data. This contribution makes it possible to: 1) improve both the convergence and the generalization of neural networks with no need to guarantee their stability; 2) build more reliable and explainable network architectures with no need for overparameterization. We refer to this property as step size self-adaptation.
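To make the idea of step size self-adaptation concrete, the following is a minimal, hypothetical sketch: it is NOT the paper's LIGHT function (whose form is not given on this page), but a generic "bold driver" style rule for SGD on a least-squares problem, where the step size grows while the loss keeps falling and is cut back when the loss rises. The function name, the toy data, and all parameter values are illustrative assumptions.

```python
import numpy as np

# Hypothetical illustration of step size self-adaptation (not the LIGHT
# function from the preprint): SGD on least squares, with the step size
# enlarged after an epoch that lowered the loss and halved otherwise.
def sgd_self_adaptive(X, y, n_epochs=50, eta0=0.02, eta_max=0.1,
                      up=1.1, down=0.5, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    eta, prev_loss = eta0, np.inf
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):
            # per-sample gradient of 0.5 * (x_i . w - y_i)^2
            grad = (X[i] @ w - y[i]) * X[i]
            w -= eta * grad
        loss = 0.5 * np.mean((X @ w - y) ** 2)
        # self-adaptation: grow the step while training progresses
        # (capped at eta_max for stability), back off when the loss rises
        eta = min(eta * up, eta_max) if loss < prev_loss else eta * down
        prev_loss = loss
    return w, eta

# toy noiseless problem: recover w_true = [2, -3]
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -3.0])
w, eta = sgd_self_adaptive(X, y)
```

The design point this sketch tries to convey is that the schedule is driven by observed training behavior rather than fixed in advance, which is what lets a moderately large step size be reached without a manually tuned decay.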

History

Email Address of Submitting Author

kulikovskikh.im@ssau.ru

ORCID of Submitting Author

0000-0002-6653-5978

Submitting Author's Institution

Samara University

Submitting Author's Country

Russian Federation