Faster ISNet for Background Bias Mitigation on Deep Neural Networks

Image background features can act as background bias (spurious correlations) and influence deep classifiers' decisions, causing shortcut learning (the Clever Hans effect) and reducing generalization to real-world data. ISNet, a recently proposed neural network architecture, introduced the idea of optimizing Layer-wise Relevance Propagation (LRP) heatmaps to improve classifier behavior. It minimizes background relevance in LRP maps to mitigate the influence of image background features on the classifier's decisions, hindering shortcut learning and improving generalization. For each training image, the original ISNet produces one heatmap per possible class in the classification task, so its training time scales linearly with the number of classes. Here, we introduce reformulated architectures whose training time is independent of this number, making the optimization process much faster. We evaluated the enhanced models on the MNIST dataset with synthetic background bias and on COVID-19 detection in chest X-rays, an application prone to shortcut learning due to background bias. The trained models minimized background attention and hindered shortcut learning while retaining high accuracy. On external (out-of-distribution) test datasets, they were consistently more accurate than multiple state-of-the-art deep neural network architectures, including a dedicated image semantic segmenter followed by a classifier. The architectures presented here offer a potentially massive improvement in training speed over the original ISNet, bringing LRP optimization to a range of applications that the original model could not feasibly handle.
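
To make the idea of penalizing background relevance more concrete, the sketch below computes the share of a heatmap's absolute relevance that falls outside a foreground mask, and averages that share over one heatmap per class. It is only an illustration under stated assumptions: the abstract does not give the ISNet loss, so the fraction-based penalty, the function names `background_relevance_fraction` and `per_class_penalty`, and the toy 2x2 heatmaps are all hypothetical and are not taken from the paper.

```python
# Illustrative only: a simplified background-relevance penalty, NOT the ISNet loss.
# All names and the toy data below are hypothetical.

def background_relevance_fraction(heatmap, foreground_mask, eps=1e-12):
    """Return the share of absolute relevance located outside the foreground mask."""
    total = 0.0
    background = 0.0
    for heat_row, mask_row in zip(heatmap, foreground_mask):
        for relevance, is_foreground in zip(heat_row, mask_row):
            magnitude = abs(relevance)
            total += magnitude
            if not is_foreground:
                background += magnitude
    # eps avoids division by zero for an all-zero heatmap
    return background / (total + eps)


def per_class_penalty(heatmaps_by_class, foreground_mask):
    """Average the penalty over one heatmap per class.

    The loop runs once per class, so the cost grows linearly with the
    number of classes, mirroring the scaling described for the original ISNet.
    """
    penalties = [
        background_relevance_fraction(heatmap, foreground_mask)
        for heatmap in heatmaps_by_class
    ]
    return sum(penalties) / len(penalties)


# Toy example: 1 marks foreground pixels, 0 marks background pixels.
mask = [[0, 1],
        [1, 0]]
focused = [[0.0, 2.0],
           [-1.0, 0.0]]   # all relevance lies on the foreground
biased = [[3.0, 1.0],
          [0.0, 0.0]]     # most relevance lies on the background

focused_penalty = background_relevance_fraction(focused, mask)
biased_penalty = background_relevance_fraction(biased, mask)
mean_penalty = per_class_penalty([focused, biased], mask)
```

In this toy setting, a heatmap that concentrates relevance on the background receives a larger penalty than one that stays on the foreground, which is the behavior a background-relevance loss is meant to discourage. A formulation whose cost does not depend on the class count would have to avoid the per-class loop in `per_class_penalty`; how the reformulated architectures achieve this is not described in the abstract.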
