Diffusion Probabilistic Models have recently shown remarkable performance in generative image modeling, attracting significant attention in the computer vision community. However, while a substantial amount of diffusion-based research has focused on generative tasks, few studies have applied diffusion models to general medical image classification. In this paper, we propose the first diffusion-based model (named DiffMIC) to address general medical image classification by eliminating unexpected noise and perturbations in medical images and robustly capturing semantic representation. To achieve this goal, we devise a dual conditional guidance strategy that conditions each diffusion step with multiple granularities to improve step-wise regional attention. Furthermore, we propose learning the mutual information in each granularity by enforcing Maximum-Mean Discrepancy regularization during the diffusion forward process. We evaluate the effectiveness of our DiffMIC on three medical classification tasks with different image modalities, including placental maturity grading on ultrasound images, skin lesion classification using dermatoscopic images, and diabetic retinopathy grading using fundus images. Our experimental results demonstrate that DiffMIC outperforms state-of-the-art methods by a significant margin, indicating the universality and effectiveness of the proposed model. Our code will be publicly available at https://github.com/scott-yjyang/DiffMIC.
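To make the Maximum-Mean Discrepancy (MMD) regularization term more concrete, the following is a minimal sketch of how a squared MMD between two sets of feature vectors could be estimated. The abstract does not specify the kernel, the estimator, or the features it is applied to, so the Gaussian kernel, the biased estimator, and the names `gaussian_kernel`, `mmd_squared`, and `bandwidth` are illustrative assumptions, not the DiffMIC implementation.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).

    The kernel choice is an assumption made for illustration only.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 * E[k(x, y)],
    with each expectation replaced by an average over all pairs.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical toy feature vectors: two well-separated clusters.
near_origin = [[0.0, 0.0], [0.1, 0.0]]
far_away = [[3.0, 3.0], [3.1, 3.0]]

same = mmd_squared(near_origin, near_origin)   # identical samples
different = mmd_squared(near_origin, far_away)  # clearly different samples
```

With this estimator, two identical samples give a value of zero (up to floating-point error), and the value grows as the two samples drift apart, which is what makes it usable as a penalty that pulls two distributions together.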