Distribution shifts are common in real-world datasets and can degrade the performance and reliability of deep learning models. In this paper, we study two types of distribution shift: diversity shift, which occurs when test samples exhibit patterns unseen during training, and correlation shift, which occurs when test data present a different correlation between invariant and spurious features that were both seen during training. We propose an integrated protocol to analyze both types of shift using datasets where they co-exist in a controllable manner. Finally, we apply our approach to a real-world classification problem, skin cancer analysis, using out-of-distribution datasets and specialized bias annotations. Our protocol reveals three findings: 1) Models learn and propagate correlation shifts even with low-bias training; this poses a risk of accumulating and combining unaccountable weak biases. 2) Models learn robust features in both high- and low-bias scenarios but use spurious ones if test samples have them; this suggests that spurious correlations do not impair the learning of robust features. 3) Diversity shift can reduce the reliance on spurious correlations; this is counterintuitive, since we expect biased models to depend more on biases when invariant features are missing. Our work has implications for distribution shift research and practice, providing new insights into how models learn and rely on spurious correlations under different types of shift.
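To make the two definitions concrete, the following is a minimal toy sketch, not the protocol or datasets used in this work. It assumes binary labels, an invariant feature that always equals the label, and a spurious feature that agrees with the label with a chosen probability. The names `make_split`, `spurious_agreement`, and `unseen_fraction` are hypothetical, and placing the unseen pattern on the spurious feature is an arbitrary choice made for illustration.

```python
import random


def make_split(n, spurious_agreement, unseen_fraction=0.0, seed=0):
    """Draw n toy samples as (label, invariant, spurious) tuples.

    spurious_agreement: probability that the spurious feature equals the
        label; changing it between train and test yields a correlation shift.
    unseen_fraction: probability that the spurious feature takes the value 2,
        which never appears when unseen_fraction is 0; a nonzero value at
        test time yields a diversity shift (an illustrative assumption).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        invariant = label  # always predictive in this toy setup
        if rng.random() < unseen_fraction:
            spurious = 2  # pattern unseen during training
        elif rng.random() < spurious_agreement:
            spurious = label
        else:
            spurious = 1 - label
        samples.append((label, invariant, spurious))
    return samples


def agreement(samples):
    """Label/spurious agreement rate over samples with a seen spurious value."""
    seen = [(y, s) for y, _, s in samples if s != 2]
    return sum(y == s for y, s in seen) / len(seen)


def unseen_rate(samples):
    """Fraction of samples whose spurious feature takes the unseen value."""
    return sum(s == 2 for _, _, s in samples) / len(samples)


# High-bias training split: the spurious feature usually matches the label.
train = make_split(10_000, spurious_agreement=0.9, seed=0)
# Correlation shift: same feature values, but the correlation is reversed.
test_corr = make_split(10_000, spurious_agreement=0.1, seed=1)
# Diversity shift: half of the test samples carry an unseen pattern.
test_div = make_split(10_000, spurious_agreement=0.9, unseen_fraction=0.5, seed=2)

print(f"train agreement:     {agreement(train):.3f}")
print(f"corr-shift agreement: {agreement(test_corr):.3f}")
print(f"div-shift unseen rate: {unseen_rate(test_div):.3f}")
```

Because the two knobs are independent, both shifts can be combined in a single test split, which mirrors the idea of letting them co-exist in a controllable manner.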