TechRxiv

PURIFYING ADVERSARIAL IMAGES USING ADVERSARIAL AUTOENCODER WITH CONDITIONAL NORMALIZING FLOWS

preprint
posted on 2023-05-11, 20:01, authored by Yi Ji, Trung-Nghia Le, Huy H. Nguyen, Isao Echizen

We present a target-agnostic adversarial autoencoder with conditional normalizing flows that, given any unlabeled image dataset, purifies adversarial samples into clean images, i.e., removes adversarial noise from the images while preserving their visual quality. In our interpretation of the model, samples are processed by manifold projection: the encoder brings each sample back into the posterior data distribution in latent space, so that the sample is less likely to appear irregular with respect to the representation learned by any target classifier. Normalizing flows conditioned on top of our hybrid network structure, together with walk-back training, are used to address common drawbacks of approaches based on generative models and autoencoders: not only the trade-off between compression loss and over-fitting to the training data but also the structural dependency of the model on dataset classes and labels. Experiments demonstrated that our proposed model is preferable to existing target-agnostic adversarial defense methods, particularly for large, unlabeled image datasets.
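
As a rough illustration of the building block named in the abstract, the sketch below shows one conditional affine coupling layer, a common way to construct a conditional normalizing flow. It is not the authors' architecture: the class name `ConditionalAffineCoupling`, the linear conditioner, the random fixed weights, and the vector sizes are all assumptions chosen for brevity. The point is only that such a layer stays exactly invertible while its scale and shift depend on a conditioning vector.

```python
import math
import random


class ConditionalAffineCoupling:
    """Hypothetical single coupling layer; weights are random, not trained."""

    def __init__(self, dim, cond_dim, seed=0):
        rng = random.Random(seed)
        self.d = dim // 2            # first self.d entries pass through unchanged
        n_in = self.d + cond_dim     # conditioner sees kept half plus condition
        n_out = dim - self.d
        self.w_s = [[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        self.w_t = [[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)]

    def _scale_shift(self, kept, cond):
        inp = list(kept) + list(cond)
        # tanh bounds the log-scale so exp() stays well behaved
        log_s = [math.tanh(sum(w * v for w, v in zip(row, inp))) for row in self.w_s]
        t = [sum(w * v for w, v in zip(row, inp)) for row in self.w_t]
        return log_s, t

    def forward(self, x, cond):
        kept, moved = x[:self.d], x[self.d:]
        log_s, t = self._scale_shift(kept, cond)
        out = [m * math.exp(s) + b for m, s, b in zip(moved, log_s, t)]
        # log|det J| of an affine coupling is the sum of the log-scales
        return list(kept) + out, sum(log_s)

    def inverse(self, y, cond):
        kept, moved = y[:self.d], y[self.d:]
        log_s, t = self._scale_shift(kept, cond)
        out = [(m - b) * math.exp(-s) for m, s, b in zip(moved, log_s, t)]
        return list(kept) + out


layer = ConditionalAffineCoupling(dim=6, cond_dim=3)
z = [0.5, -1.2, 0.3, 2.0, -0.7, 1.1]   # stand-in for a latent code
c = [1.0, 0.0, -1.0]                   # stand-in for a conditioning vector
y, log_det = layer.forward(z, c)
z_rec = layer.inverse(y, c)            # recovers z up to floating-point error
```

In a trained model the conditioner would be a learned network and several such layers would be stacked; the tractable log-determinant returned by `forward` is what makes likelihood-based training of a flow possible.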

Funding

JSPS KAKENHI Grants JP18H04120, JP20K23355, JP21H04907, and JP21K18023

JST CREST Grants JPMJCR18A6 and JPMJCR20D3

Email Address of Submitting Author

jiyi@g.ecc.u-tokyo.ac.jp

ORCID of Submitting Author

0000-0001-9134-9598

Submitting Author's Institution

The University of Tokyo

Submitting Author's Country

  • Japan
