The Invisible Gorilla Effect in Out-of-distribution Detection

Deep neural networks achieve high performance in vision tasks by learning features from regions of interest (ROI) within images, but their performance degrades when they are deployed on out-of-distribution (OOD) data that differs from the training data. This challenge has led to OOD detection methods that aim to identify and reject unreliable predictions. Although prior work shows that OOD detection performance varies by artefact type, the underlying causes remain underexplored. We identify a previously unreported bias in OOD detection: for hard-to-detect artefacts (near-OOD), detection performance typically improves when the artefact shares visual similarity (e.g. colour) with the model's ROI and drops when it does not, a phenomenon we term the Invisible Gorilla Effect. For example, in a skin lesion classifier whose ROI is red lesions, the Mahalanobis Score method achieves a 31.5% higher AUROC when detecting OOD red ink annotations (similar to the ROI) than when detecting black ink annotations (dissimilar). We annotated artefacts by colour in 11,355 images from three public datasets (e.g. ISIC) and generated colour-swapped counterfactuals to rule out dataset bias. We then evaluated 40 OOD methods across 7 benchmarks and found significant performance drops for most methods when artefacts differed from the ROI. Our findings highlight an overlooked failure mode in OOD detection and provide guidance for more robust detectors. Code and annotations are available at: https://github.com/HarryAnthony/Invisible_Gorilla_Effect.
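
To make the evaluation quantity concrete, the following is a minimal sketch of how an AUROC gap between two artefact groups could be measured with a Mahalanobis-style score. It is not the authors' implementation: it fits a single Gaussian to in-distribution features rather than class-conditional Gaussians, the function names (fit_gaussian, mahalanobis_score, auroc) are hypothetical, and the features are synthetic random vectors, so the numbers it produces say nothing about the results reported above.

```python
import numpy as np


def fit_gaussian(train_feats):
    """Estimate mean and precision matrix of in-distribution features."""
    mean = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False)
    # Small ridge term keeps the covariance invertible (an assumption of this sketch).
    cov += 1e-6 * np.eye(train_feats.shape[1])
    return mean, np.linalg.inv(cov)


def mahalanobis_score(feats, mean, precision):
    """Squared Mahalanobis distance to the training mean; higher means more OOD."""
    diff = feats - mean
    return np.einsum("ij,jk,ik->i", diff, precision, diff)


def auroc(id_scores, ood_scores):
    """Probability that a random OOD sample scores above a random ID sample."""
    id_col = np.asarray(id_scores, dtype=float)[:, None]
    ood_row = np.asarray(ood_scores, dtype=float)[None, :]
    # Ties count as half, matching the usual rank-based AUROC definition.
    return float((ood_row > id_col).mean() + 0.5 * (ood_row == id_col).mean())


# Synthetic stand-ins for network features (assumed, not taken from any dataset).
rng = np.random.default_rng(0)
dim = 8
train_feats = rng.normal(0.0, 1.0, size=(2000, dim))
id_test_feats = rng.normal(0.0, 1.0, size=(500, dim))
group_a_feats = rng.normal(3.0, 1.0, size=(500, dim))  # strongly shifted artefact group
group_b_feats = rng.normal(0.5, 1.0, size=(500, dim))  # weakly shifted artefact group

mean, precision = fit_gaussian(train_feats)
id_scores = mahalanobis_score(id_test_feats, mean, precision)
auroc_a = auroc(id_scores, mahalanobis_score(group_a_feats, mean, precision))
auroc_b = auroc(id_scores, mahalanobis_score(group_b_feats, mean, precision))

print(f"AUROC group A: {auroc_a:.3f}")
print(f"AUROC group B: {auroc_b:.3f}")
print(f"AUROC gap:     {auroc_a - auroc_b:.3f}")
```

In a real comparison, the two groups would be features extracted from images containing ROI-similar and ROI-dissimilar artefacts, and the gap between the two AUROC values is the kind of difference the abstract describes.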
