Right Regions, Wrong Labels: Semantic Label Flips in Segmentation under Correlation Shift

Spurious correlations between non-causal input features and target labels can compromise the robustness of machine learning models. A common way to test for such correlations is to train on data where the label is strongly tied to a non-causal cue, then evaluate on examples where that tie no longer holds. This protocol is well established for classification, but the failure modes it exposes in semantic segmentation are not well understood. We show that a segmentation model can achieve reasonable overlap while assigning the wrong semantic label, swapping one plausible foreground class for another even when object boundaries are largely correct. We focus on this semantic label-flip behaviour and quantify it with a simple diagnostic (Flip) that counts how often ground-truth foreground pixels are predicted as foreground but assigned the wrong foreground identity. In a setting where category and scene are correlated during training, increasing the correlation consistently widens the gap between common and rare test conditions and increases these within-object label swaps on counterfactual groups. Overall, our results motivate assessing segmentation robustness under distribution shift beyond overlap, by decomposing ground-truth foreground pixels into correct, flipped-identity, and missed-to-background pixels. We also propose an entropy-based 'flip-risk' score that requires no ground-truth labels; it is computed from foreground identity uncertainty, and we show that it can flag flip-prone cases at inference time. Code is available at https://github.com/acharaakshit/label-flips.
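The three-way split of ground-truth foreground pixels and the entropy-based score described above can be illustrated with a minimal sketch. This is not the released implementation. The function names `foreground_decomposition` and `flip_risk`, the convention that class id 0 is background, and the specific choice of renormalising probabilities over foreground classes and averaging normalised entropy over predicted-foreground pixels are all assumptions made for illustration; the abstract does not specify these details.

```python
import math

BACKGROUND = 0  # assumption: class id 0 denotes background


def foreground_decomposition(gt, pred, background=BACKGROUND):
    """Split ground-truth foreground pixels into correct / flip / miss fractions.

    gt, pred: flat sequences of integer class ids, one per pixel.
    """
    correct = flip = miss = 0
    for g, p in zip(gt, pred):
        if g == background:
            continue  # only ground-truth foreground pixels are counted
        if p == g:
            correct += 1   # right region, right label
        elif p == background:
            miss += 1      # foreground pixel lost to background
        else:
            flip += 1      # still foreground, but wrong identity
    total = correct + flip + miss
    if total == 0:
        return {"correct": 0.0, "flip": 0.0, "miss": 0.0}
    return {"correct": correct / total, "flip": flip / total, "miss": miss / total}


def flip_risk(probs, background=BACKGROUND):
    """Label-free score: mean normalised entropy over foreground identities.

    probs: sequence of per-pixel class-probability lists (e.g. softmax outputs).
    Only pixels whose argmax is a foreground class contribute (assumption).
    """
    entropies = []
    for p in probs:
        predicted = max(range(len(p)), key=p.__getitem__)
        if predicted == background:
            continue
        # Renormalise over foreground classes so the entropy reflects
        # uncertainty about *which* foreground class, not fg vs. bg.
        fg = [q for c, q in enumerate(p) if c != background]
        z = sum(fg)
        if z <= 0 or len(fg) < 2:
            continue
        h = -sum((q / z) * math.log(q / z) for q in fg if q > 0)
        entropies.append(h / math.log(len(fg)))  # scale to [0, 1]
    return sum(entropies) / len(entropies) if entropies else 0.0


if __name__ == "__main__":
    gt = [0, 1, 1, 2, 2, 2]
    pred = [0, 1, 2, 2, 0, 1]
    print(foreground_decomposition(gt, pred))
    print(flip_risk([[0.1, 0.45, 0.45], [0.05, 0.9, 0.05]]))
```

Under these assumptions the three fractions sum to one over ground-truth foreground pixels, so a high overlap score can coexist with a large flip fraction, and the entropy score needs only the model's output probabilities.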
