Neural networks are often biased toward spuriously correlated features, which provide misleading statistical evidence that does not generalize. This raises an interesting question: ``Does an optimal unbiased functional subnetwork exist in a severely biased network? If so, how can such a subnetwork be extracted?'' Although empirical evidence for the existence of such unbiased subnetworks has accumulated, these observations rely mainly on guidance from ground-truth unbiased samples. How to discover the optimal subnetworks with biased training datasets therefore remains unexplored in practice. To address this, we first present a theoretical insight that points to potential limitations of existing algorithms in exploring unbiased subnetworks in the presence of strong spurious correlations. We then elucidate the importance of bias-conflicting samples for structure learning. Motivated by these observations, we propose the Debiased Contrastive Weight Pruning (DCWP) algorithm, which probes unbiased subnetworks without expensive group annotations. Experimental results demonstrate that our approach significantly outperforms state-of-the-art debiasing methods despite a considerable reduction in the number of parameters.