What You See Is What You Transform: Foveated Spatial Transformers as a bio-inspired attention mechanism

Posted on 2021-09-07, 21:28. Authored by Ghassan Dabane, Laurent Perrinet, Emmanuel Daucé.
Convolutional Neural Networks have been considered the go-to option for object recognition in computer vision for the last couple of years. However, their invariance to object translations is still deemed a weak point: it is limited to small translations and comes only from their max-pooling layers. One bio-inspired approach draws on the separation of the What and Where pathways in mammals to overcome this limitation. This approach works as a nature-inspired attention mechanism; another classical attention mechanism is the Spatial Transformer, which allows adaptive end-to-end learning of different classes of spatial transformations during training. In this work, we give an overview of Spatial Transformers as an attention-only mechanism and compare them with the What/Where model. We show that attention-restricted, or “Foveated”, Spatial Transformer Networks, coupled with a curriculum learning training scheme and an efficient log-polar visual space entry, perform better than the What/Where model, without the need for any extra supervision.
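
As an illustration of the log-polar visual space entry mentioned above, here is a minimal sketch of a log-polar sampling grid: sample points sit on rings whose radii grow geometrically away from a fixation point, so resolution is dense near the center and sparse in the periphery. This is not the authors' implementation; the function name `log_polar_grid` and all of its parameters are hypothetical and chosen only for illustration.

```python
import math


def log_polar_grid(n_rings, n_angles, r_min, r_max, center=(0.0, 0.0)):
    """Return a list of rings, each a list of (x, y) sample points.

    Hypothetical helper for illustration only. Ring radii are spaced
    geometrically between r_min and r_max, so consecutive radii share
    a constant ratio (uniform spacing in log-radius).
    """
    if r_min <= 0 or r_max < r_min:
        raise ValueError("need 0 < r_min <= r_max")
    cx, cy = center
    rings = []
    for i in range(n_rings):
        # Fraction of the way from the innermost to the outermost ring.
        t = i / (n_rings - 1) if n_rings > 1 else 0.0
        radius = r_min * (r_max / r_min) ** t
        ring = []
        for j in range(n_angles):
            # Angles are spaced uniformly around the fixation point.
            theta = 2.0 * math.pi * j / n_angles
            ring.append((cx + radius * math.cos(theta),
                         cy + radius * math.sin(theta)))
        rings.append(ring)
    return rings


# Example call: 4 rings of 8 points each around the origin.
grid = log_polar_grid(n_rings=4, n_angles=8, r_min=1.0, r_max=8.0)
```

Sampling an image at these coordinates (for instance with bilinear interpolation) would give a foveated input whose size depends only on `n_rings` and `n_angles`, not on the image resolution.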


Submitting Author's Institution

Institut de Neurosciences de la Timone

Submitting Author's Country

  • France