Industrial anomaly detection demands precise reasoning over fine-grained defect patterns. However, existing multimodal large language models (MLLMs), pretrained on general-domain data, often struggle to capture category-specific anomalies, thereby limiting both detection accuracy and interpretability. To address these limitations, we propose Reason-IAD, a knowledge-guided dynamic latent reasoning framework for explainable industrial anomaly detection. Reason-IAD comprises two core components. First, a retrieval-augmented knowledge module incorporates category-specific textual descriptions into the model input, enabling context-aware reasoning over domain-specific defects. Second, an entropy-driven latent reasoning mechanism conducts iterative exploration within a compact latent space using optimizable latent think tokens, guided by an entropy-based reward that encourages confident and stable predictions. Furthermore, a dynamic visual injection strategy selectively incorporates the most informative image patches into the latent sequence, directing the reasoning process toward regions critical for anomaly detection. Extensive experimental results demonstrate that Reason-IAD consistently outperforms state-of-the-art methods. The code will be publicly available at https://github.com/chenpeng052/Reason-IAD.
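As an illustration of the two mechanisms named above, the following minimal sketch shows one way an entropy-based reward and a top-k patch selection could be computed. It is not the authors' implementation: the abstract does not specify the reward formula, the patch scoring rule, or how the latent think tokens are optimized. The function names (`entropy_reward`, `top_k_patches`), the use of negative Shannon entropy as the reward, and the scalar per-patch scores are all assumptions made for this example.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_reward(logits: Sequence[float]) -> float:
    """Hypothetical reward: negative entropy of the predictive distribution.

    Assumption: a more confident (lower-entropy) prediction earns a higher reward.
    """
    return -entropy(softmax(logits))


def top_k_patches(scores: Sequence[float], k: int) -> List[int]:
    """Hypothetical patch selection: indices of the k highest-scoring patches.

    Assumption: each image patch already has a scalar informativeness score.
    Indices are returned in their original spatial order.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    confident = [4.0, 0.0]   # e.g. logits for "anomalous" vs "normal"
    uncertain = [0.1, 0.0]
    print(entropy_reward(confident) > entropy_reward(uncertain))
    print(top_k_patches([0.1, 0.9, 0.3, 0.7], k=2))
```

Under these assumptions, a reasoning step that sharpens the answer distribution receives a higher reward than one that leaves it near uniform, and only the selected patch indices would be appended to the latent sequence.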