Adversarial Detection by Approximation of Ensemble Boundary
A spectral approximation of a Boolean function is proposed for approximating the decision boundary of an ensemble of Deep Neural Networks (DNNs) solving two-class pattern recognition problems. The Walsh combination of relatively weak DNN classifiers is shown experimentally to be capable of detecting adversarial attacks. The observed difference in Walsh coefficient approximation between clean and adversarial images suggests that the transferability of attacks may be used for detection. Approximating the decision boundary may also aid in understanding the learning and transferability properties of DNNs. While the experiments here use images, the proposed approach of modelling two-class ensemble decision boundaries could in principle be applied to any application area.
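To make the idea of a Walsh combination concrete, the following is a minimal sketch, not the authors' implementation. It assumes each base classifier's decision is coded as -1 or +1, estimates only first-order Walsh coefficients as the correlation between each classifier's output and the target label, and combines the classifiers by a coefficient-weighted vote. The function names `first_order_walsh` and `walsh_combine` and the toy data are hypothetical, introduced purely for illustration.

```python
import numpy as np

def first_order_walsh(outputs, labels):
    """Estimate first-order Walsh coefficients (assumed form, for illustration).

    outputs: (n_patterns, n_classifiers) array of decisions in {-1, +1}
    labels:  (n_patterns,) array of target classes in {-1, +1}
    Returns one coefficient per classifier: the mean agreement between
    that classifier's decision and the target label.
    """
    outputs = np.asarray(outputs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return (outputs * labels[:, None]).mean(axis=0)

def walsh_combine(outputs, coeffs):
    """Combine classifier decisions by a coefficient-weighted vote."""
    scores = np.asarray(outputs, dtype=float) @ coeffs
    # Ties (score == 0) are assigned to class +1 by convention here.
    return np.where(scores >= 0, 1, -1)

# Toy data: 4 patterns, 3 hypothetical base classifiers.
outputs = np.array([
    [ 1,  1, -1],
    [ 1, -1, -1],
    [-1, -1,  1],
    [-1,  1,  1],
])
labels = np.array([1, 1, -1, -1])

coeffs = first_order_walsh(outputs, labels)
predictions = walsh_combine(outputs, coeffs)
```

In this toy case the first classifier always agrees with the label, the second is uncorrelated with it, and the third always disagrees, so the coefficients weight them positively, at zero, and negatively. One could compute the same coefficients on clean and on perturbed inputs and compare them; whether that comparison matches the detection procedure used in the paper is an assumption.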
Email Address of Submitting Author: t.email@example.com
ORCID of Submitting Author: 0000-0002-5058-9701
Submitting Author's Institution: University of Surrey
Submitting Author's Country: United Kingdom