Traditional deep learning models often suffer from a scarcity of annotated data, especially in cross-domain applications such as anomaly detection, which is critical for early disease diagnosis in medicine and defect detection in industry. To address this challenge, we propose Multi-AD, a convolutional neural network (CNN) model for robust unsupervised anomaly detection across medical and industrial images. Our approach employs squeeze-and-excitation (SE) blocks to enhance feature extraction through channel-wise attention, allowing the model to focus on the most relevant features and detect subtle anomalies. Knowledge distillation (KD) transfers informative features from the teacher to the student model, enabling effective learning of the differences between normal and anomalous data, and a discriminator network further strengthens the model's ability to distinguish between the two. At inference, the student model integrates multi-scale features to detect anomalies of varying sizes. The teacher-student (T-S) architecture ensures consistent representation of high-dimensional features while adapting them to enhance anomaly detection. Multi-AD was evaluated on several medical datasets, including brain MRI, liver CT, and retina OCT, as well as industrial datasets such as MVTec AD, demonstrating strong generalization across domains. Experimental results show that our approach consistently outperforms state-of-the-art models, achieving the best average AUROC at both the image level (81.4% medical, 99.6% industrial) and the pixel level (97.0% medical, 98.4% industrial), making it effective for real-world applications.
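To make the channel-wise attention mechanism concrete, the following is a minimal NumPy sketch of a generic squeeze-and-excitation block. It is an illustration of the standard SE pattern (global average pooling, a bottleneck MLP with ReLU, and a sigmoid gate that rescales each channel), not the paper's exact implementation; all function and variable names, the reduction ratio `r`, and the random weights are hypothetical.

```python
import numpy as np

def se_block(x, w1, b1, w2, b2):
    """Squeeze-and-excitation over a (C, H, W) feature map.

    Hypothetical standalone sketch: w1/b1 and w2/b2 are the two
    fully-connected layers of the excitation MLP (channel reduction
    followed by expansion back to C channels).
    """
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    z = x.mean(axis=(1, 2))
    # Excitation: bottleneck MLP, ReLU then sigmoid gate in (0, 1)
    h = np.maximum(0.0, w1 @ z + b1)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))
    # Scale: reweight every channel of the feature map by its gate
    return x * s[:, None, None]

# Toy usage with assumed shapes: 8 channels, 4x4 spatial map, reduction r=2.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)); b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)); b2 = np.zeros(C)
y = se_block(x, w1, b1, w2, b2)
print(y.shape)  # (8, 4, 4)
```

The output keeps the input's shape; only the relative weighting of channels changes, which is what lets the network emphasize the feature channels most informative for subtle anomalies.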