Shortcut Invariance: Targeted Jacobian Regularization in Disentangled Latent Space

Deep neural networks are prone to learning shortcuts: spurious correlations in the training data that undermine out-of-distribution (OOD) generalization. Most prior work mitigates shortcut learning through input-space reweighting, either relying on explicit shortcut labels or inferring shortcut structure from heuristics such as per-sample loss. These approaches also typically assume that the training set contains some shortcut-conflicting examples, an assumption that is often violated in practice, particularly in medical imaging, where data are aggregated across institutions with different acquisition protocols. We propose a latent-space method that views shortcut learning as over-reliance on shortcut-aligned axes. In a disentangled latent space, we identify candidate shortcut-aligned axes by their strong correlation with the labels and reduce the classifier's reliance on them by injecting targeted anisotropic noise during training. Unlike prior latent-space approaches that remove, project out, or adversarially suppress shortcut features, our method preserves the full representation and instead imposes functional invariance by regularizing the classifier's sensitivity along those axes. We show that injecting anisotropic noise induces targeted Jacobian and curvature regularization, effectively flattening the decision boundary along shortcut axes while leaving core feature dimensions largely unaffected. Our method achieves state-of-the-art OOD performance across standard shortcut-learning benchmarks without requiring shortcut labels or shortcut-conflicting samples.
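
To make the mechanism concrete, the following is a minimal NumPy sketch of the two steps the abstract describes: scoring latent axes by their correlation with the labels, and injecting noise only along the high-scoring axes. It is an illustration under stated assumptions, not the paper's implementation. The function names, the use of absolute Pearson correlation, the hard threshold, and the single noise level `sigma` are all hypothetical choices made for this example. The last function covers only the simplest case, a linear scorer with squared loss, where the expected extra loss from the noise is exactly a per-axis penalty on the weights (the Jacobian of a linear model).

```python
import numpy as np

def label_correlation_scores(z, y):
    """Absolute Pearson correlation between each latent axis and the label.

    z: (n, d) latent codes; y: (n,) numeric labels.
    Assumption: correlation is used here as the axis-selection statistic.
    """
    zc = z - z.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((zc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    return np.abs(zc.T @ yc) / denom

def anisotropic_noise_scales(scores, threshold=0.8, sigma=1.0):
    """Per-axis noise std: sigma on candidate shortcut axes, 0 elsewhere.

    Assumption: a hard threshold; the actual selection rule may differ.
    """
    return np.where(scores > threshold, sigma, 0.0)

def inject_noise(z, scales, rng):
    """Add zero-mean Gaussian noise whose std differs per latent axis."""
    return z + rng.standard_normal(z.shape) * scales

def linear_noise_penalty(w, scales):
    """Expected extra squared loss for a linear scorer f(z) = w @ z.

    E[(w @ (z + eps) - t)^2] - (w @ z - t)^2 = sum_i scales_i^2 * w_i^2,
    so only weights on noised axes are penalized.
    """
    return float(np.sum((scales * w) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    y = rng.integers(0, 2, n).astype(float)
    s = 2 * y - 1
    # Synthetic latents: axis 0 tracks the label almost perfectly (shortcut-like),
    # axis 1 is a noisier label-related feature, axis 2 is unrelated.
    z = np.stack(
        [s + 0.1 * rng.standard_normal(n),
         s + 2.0 * rng.standard_normal(n),
         rng.standard_normal(n)],
        axis=1,
    )
    scores = label_correlation_scores(z, y)
    scales = anisotropic_noise_scales(scores, threshold=0.8, sigma=0.5)
    z_noisy = inject_noise(z, scales, rng)
    print("scores:", scores)
    print("noise scales:", scales)
```

In this toy setup only the first axis receives noise, so a classifier trained on `z_noisy` is discouraged from relying on it while the remaining axes pass through unchanged. For nonlinear classifiers the penalty is only approximate and also involves curvature terms.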
