This paper presents the Global-Local Correspondence Framework (GLCF), a novel framework for visual anomaly detection with logical constraints. Visual anomaly detection has become an active research area with real-world applications such as industrial anomaly detection and medical disease diagnosis. However, most existing methods focus on identifying local structural degeneration anomalies and often fail to detect high-level functional anomalies that involve logical constraints. To address this issue, we propose a two-branch approach consisting of a local branch for detecting structural anomalies and a global branch for detecting logical anomalies. To facilitate local-global feature correspondence, we introduce a novel semantic bottleneck enabled by the visual Transformer. Moreover, we develop a separate feature estimation network for each branch to detect anomalies. We validate the proposed framework on several benchmarks, including the industrial datasets MVTec AD and MVTec LOCO AD and the Retinal-OCT medical dataset. Experimental results show that our method outperforms existing methods, particularly in detecting logical anomalies.