Unsupervised Contrastive Analysis for Salient Pattern Detection using Conditional Diffusion Models

Contrastive Analysis (CA) addresses the problem of identifying patterns in images that distinguish a background (BG) dataset (e.g., healthy subjects) from a target (TG) dataset (e.g., unhealthy subjects). Recent works on this topic rely on variational autoencoders (VAEs) or contrastive learning strategies to learn the patterns that separate TG samples from BG samples in a supervised manner. However, this dependency on target (unhealthy) samples can be problematic in medical scenarios, where such samples are scarce. In addition, the blurred reconstructions produced by VAEs limit their utility and interpretability. In this work, we redefine the CA task by employing a self-supervised contrastive encoder to learn a latent representation that encodes only the common patterns of the input images. The encoder is trained exclusively on samples from the BG dataset, and the distribution of the target patterns is approximated through data augmentation. We then condition a diffusion model, a state-of-the-art generative method, on the learned latent representation to produce a realistic (healthy) version of the input image that encodes solely the common patterns. Thorough validation on a facial image dataset and experiments on three brain MRI datasets demonstrate that conditioning the generative process on the latent representation from our self-supervised contrastive encoder improves both the quality of the generated images and the accuracy of image classification. The code is available at https://github.com/CristianoPatricio/unsupervised-contrastive-cond-diff.
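
As a rough illustration of the two ideas summarized above, the following minimal sketch shows (i) a contrastive objective that pulls the embedding of a BG image toward the embedding of its augmented (pseudo-target) view, so that the latent keeps only the common patterns, and (ii) a salient-pattern map obtained by comparing an input with its generated "common-only" version. The abstract does not specify the loss or the saliency computation, so the InfoNCE form, the absolute-difference map, and the names `info_nce`, `saliency_map`, and `temperature` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def info_nce(z_clean, z_aug, temperature=0.1):
    """InfoNCE-style loss (assumed form): row i of z_clean should match row i of z_aug.

    z_clean: embeddings of BG images, shape (batch, dim).
    z_aug:   embeddings of the same images after augmentation, shape (batch, dim).
    """
    # L2-normalise so that the dot product is a cosine similarity.
    z_clean = z_clean / np.linalg.norm(z_clean, axis=1, keepdims=True)
    z_aug = z_aug / np.linalg.norm(z_aug, axis=1, keepdims=True)

    # Pairwise similarities; the diagonal holds the positive pairs.
    logits = z_clean @ z_aug.T / temperature
    # Subtract the row-wise maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Negative log-likelihood of the positive pairs, averaged over the batch.
    return float(-np.mean(np.diag(log_prob)))


def saliency_map(x, x_common):
    """Hypothetical salient-pattern map: absolute difference between the input
    and its generated version containing only the common patterns."""
    return np.abs(x - x_common)
```

In this sketch the loss is low when each image and its augmented view share an embedding, which is the behaviour a "common patterns only" latent would need; in a full pipeline, `x_common` would come from the conditional diffusion model rather than being supplied directly.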
