Machine learning models tend to learn associations present in the training data without regard to their validity or appropriateness. This property is the root cause of spurious correlations, which render models unreliable and prone to failure under distribution shift. Research shows that most methods for mitigating spurious correlations are effective only against spurious associations that are already known for a given model. Current spurious correlation detection algorithms either rely on extensive human annotation or are too restrictive in their formulation. Moreover, they rely on strict definitions of visual artifacts that may not apply to data produced by generative models, which are known to hallucinate content that does not conform to standard specifications. In this work, we introduce a general-purpose method that efficiently detects potential spurious correlations and requires significantly less human intervention than prior art. Additionally, the proposed method provides intuitive explanations while eliminating the need for pixel-level annotations. We demonstrate the proposed method's tolerance to the peculiarities of AI-generated images, a considerably challenging setting in which most existing methods fall short. Consequently, our method is also suitable for detecting spurious correlations originating from generative models that may propagate to downstream applications.