TechRxiv

IG2: Integrated Gradient on Iterative Gradient Path for eXplainable AI

preprint
Posted on 2023-09-11, 16:18; authored by Yue Zhuo, Zhiqiang Ge

Feature attribution explains Artificial Intelligence (AI) models at the instance level by assigning each input feature an importance score that reflects its contribution to the model prediction. Integrated Gradients (IG) is a prevalent path attribution method for deep neural networks; it integrates gradients along a path between the explained input (the explicand) and a counterfactual instance called the baseline. Existing IG-based variants consider only the gradient of the explicand's output, but we find that the gradient of the counterfactual output also has a significant effect on feature attribution. To exploit this, we propose Iterative Gradient path Integrated Gradients (IG2), which considers both gradients. IG2 incorporates the counterfactual gradient iteratively into the integration path, yielding a novel path (GradPath) and a novel baseline (GradCF). These two components substantially mitigate the problems of attribution noise and arbitrary baseline choice in previous IG methods. As a path method, IG2 satisfies many desirable axioms, which are theoretically justified in the paper. Experiments are conducted on the synthetic tabular XAI benchmark and multiple real-world datasets, including classification tasks on ImageNet, TREC questions, and wafer map failure patterns. The qualitative and quantitative results validate that IG2 provides feature attributions superior to those of previous attribution techniques.
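
The path integral underlying IG can be illustrated with a minimal sketch. The Python below implements standard straight-line Integrated Gradients, not the GradPath or GradCF components of IG2, on a toy logistic model. The model, its weights, the input, the all-zero baseline, and every function name are illustrative assumptions and do not come from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    # Toy differentiable model (assumed for illustration): logistic unit.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def model_grad(x, w, b):
    # Analytic gradient of the logistic unit with respect to the input x.
    s = model(x, w, b)
    return [s * (1.0 - s) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    # Straight-line path from baseline to explicand, approximated with a
    # midpoint Riemann sum over `steps` interpolation points.
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        g = model_grad(point, w, b)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    # Attribution_i = (x_i - baseline_i) * average gradient along the path.
    return [(xi - bi) * gi for xi, bi, gi in zip(x, baseline, avg_grad)]

# Hypothetical weights, explicand, and baseline.
w = [1.5, -2.0, 0.5]
b = 0.1
x = [1.0, 0.2, -1.0]
baseline = [0.0, 0.0, 0.0]

attr = integrated_gradients(x, baseline, w, b)
# Completeness: attributions should sum to f(x) - f(baseline).
gap = abs(sum(attr) - (model(x, w, b) - model(baseline, w, b)))
```

Here `gap` measures how closely the attributions satisfy the completeness axiom that path methods share. In this sketch the all-zero baseline is exactly the kind of arbitrary choice that the abstract says GradCF is meant to replace.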

Email Address of Submitting Author

zhuoy1995@zju.edu.cn

Submitting Author's Institution

Zhejiang University

Submitting Author's Country

  • China