Training-Free Anomaly Generation via Dual-Attention Enhancement in Diffusion Model

Industrial anomaly detection (AD) plays a significant role in manufacturing, where data scarcity is a long-standing challenge. A growing body of work addresses the shortage of anomaly data through anomaly generation. However, existing anomaly generation methods either lack fidelity or require training on extra data. To this end, we propose AAG, a training-free anomaly generation framework that builds on the strong generative ability of Stable Diffusion (SD) to produce effective anomaly images. Given a normal image, a mask, and a simple text prompt, AAG generates realistic and natural anomalies in the specified regions while keeping the content of the other regions unchanged. In particular, we propose Cross-Attention Enhancement (CAE), which re-engineers the cross-attention mechanism within Stable Diffusion based on the given mask. CAE increases the similarity between the visual tokens in the specified regions and the text embeddings, which guides the generated visual tokens to follow the text description. In addition, generated anomalies need to look natural and plausible on the object in the given image. We therefore propose Self-Attention Enhancement (SAE), which increases the similarity between each normal visual token and the anomaly visual tokens. SAE ensures that generated anomalies are coherent with the original pattern. Extensive experiments on the MVTec AD and VisA datasets demonstrate the effectiveness of AAG in anomaly generation and its utility. Furthermore, anomaly images generated by AAG can bolster the performance of various downstream anomaly inspection tasks.
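
The abstract describes CAE and SAE only in words, so the following is a minimal illustrative sketch of the general idea and not the authors' implementation. It assumes that "increasing similarity" means adding a constant bias to pre-softmax attention logits, that the mask is a flat list of 0/1 flags per visual token, and that SAE is applied to anomaly queries attending to normal keys. The function names, the `boost` parameter, and the choice of which text tokens describe the anomaly are all hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def enhance_cross_attention(logits, mask, anomaly_text_ids, boost=2.0):
    """Hypothetical CAE-style step.

    logits[i][j]: pre-softmax similarity of visual token i and text token j.
    mask[i]: 1 if visual token i lies in the region to be edited, else 0.
    For masked visual tokens, the logits of the anomaly-describing text
    tokens are raised by `boost` (an assumed additive form).
    """
    weights = []
    for i, row in enumerate(logits):
        if mask[i]:
            row = [x + boost if j in anomaly_text_ids else x
                   for j, x in enumerate(row)]
        weights.append(softmax(row))
    return weights


def enhance_self_attention(logits, mask, boost=1.0):
    """Hypothetical SAE-style step.

    logits[i][k]: pre-softmax similarity of visual query i and visual key k.
    For queries inside the mask (anomaly tokens), the logits toward keys
    outside the mask (normal tokens) are raised by `boost`, so anomaly
    tokens draw more on the surrounding normal pattern. Rows of normal
    queries are left unchanged (an assumption about the direction).
    """
    weights = []
    for i, row in enumerate(logits):
        if mask[i]:
            row = [x + boost if not mask[k] else x
                   for k, x in enumerate(row)]
        weights.append(softmax(row))
    return weights


# Toy example: 3 visual tokens, 2 text tokens, token 1 is in the mask.
mask = [0, 1, 0]
cross_logits = [[0.0, 0.0] for _ in range(3)]
self_logits = [[0.0, 0.0, 0.0] for _ in range(3)]

cross_w = enhance_cross_attention(cross_logits, mask, anomaly_text_ids={1})
self_w = enhance_self_attention(self_logits, mask)

print("cross-attention weights:", cross_w)
print("self-attention weights:", self_w)
```

In this toy setup only the masked visual token changes its attention distribution: it attends more strongly to the anomaly text token in the cross-attention map and more strongly to the normal tokens in the self-attention map, while the rows for unmasked tokens stay uniform. In an actual Stable Diffusion pipeline the same kind of adjustment would have to be applied inside the U-Net attention layers at each denoising step, which this sketch does not attempt.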
