Deep neural networks have been shown to learn and rely on spurious correlations present in their training data. Reliance on such correlations can cause these networks to malfunction when deployed in the real world, where the correlations may no longer hold. To overcome the learning of and reliance on such correlations, recent studies have proposed approaches that yield promising results. These works, however, study settings where the strength of the spurious signal is significantly greater than that of the core, invariant signal, making it easier to detect the presence of spurious features in individual training samples and allowing for further processing. In this paper, we identify new settings where the strength of the spurious signal is relatively weaker, making it difficult to detect any spurious information while still having catastrophic consequences. We also discover that spurious correlations are learned primarily due to only a handful of all the samples containing the spurious feature, and we develop a novel data pruning technique that identifies and prunes the small subsets of the training data that contain these samples. Our proposed technique does not require inferred domain knowledge, information regarding the sample-wise presence or nature of spurious information, or human intervention. Finally, we show that such data pruning attains state-of-the-art performance on previously studied settings where spurious information is identifiable.