Inspired by recent findings that generative diffusion models learn semantically meaningful representations, we use them to discover the intrinsic hierarchical structure of 3D biomedical images through unsupervised segmentation. We show that features taken from different stages of a diffusion model's U-Net-based, ladder-like architecture capture different hierarchy levels in these images. We design three losses that train a predictive unsupervised segmentation network to decompose 3D volumes into meaningful nested subvolumes that represent a hierarchy. First, we pretrain 3D diffusion models and use the consistency of their features across subvolumes. Second, we use the visual consistency between subvolumes. Third, we use invariance to photometric augmentations as a regularizer. Our models outperform prior unsupervised structure discovery approaches on challenging biologically inspired synthetic datasets and on a real-world brain tumor MRI dataset.
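To make the third loss more concrete, the following is a minimal sketch of an invariance regularizer under photometric augmentation. It is not the implementation used in this work: the function names (`photometric_augment`, `soft_segment`, `invariance_loss`), the affine gain/bias augmentation, the toy class centers, and the mean squared error penalty are all assumptions chosen for illustration, and `soft_segment` is a stand-in for a real segmentation network.

```python
import numpy as np


def photometric_augment(volume, gain, bias):
    """Apply a global affine intensity change (contrast and brightness) to a 3D volume."""
    return gain * volume + bias


def soft_segment(volume, centers, normalize=True, temperature=0.5):
    """Toy stand-in for a segmentation network (hypothetical, for illustration only).

    Returns soft class assignments of shape (K, D, H, W) that sum to 1 over axis 0.
    """
    v = volume.astype(np.float64)
    if normalize:
        # z-scoring removes global gain and bias, so predictions become invariant to them
        v = (v - v.mean()) / (v.std() + 1e-8)
    # negative squared distance to each class center acts as a logit
    logits = -((v[None] - centers[:, None, None, None]) ** 2) / temperature
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=0, keepdims=True)


def invariance_loss(segment_fn, volume, gain, bias):
    """Mean squared difference between predictions on the original and augmented volume."""
    p_orig = segment_fn(volume)
    p_aug = segment_fn(photometric_augment(volume, gain, bias))
    return float(np.mean((p_orig - p_aug) ** 2))


rng = np.random.default_rng(0)
volume = rng.normal(size=(8, 8, 8))      # synthetic stand-in for a 3D image
centers = np.array([-1.0, 0.0, 1.0])     # assumed intensity centers for 3 classes

loss_invariant = invariance_loss(
    lambda v: soft_segment(v, centers, normalize=True), volume, gain=1.3, bias=0.2
)
loss_sensitive = invariance_loss(
    lambda v: soft_segment(v, centers, normalize=False), volume, gain=1.3, bias=0.2
)
```

In this toy setting the normalized model yields a loss near zero, while the unnormalized one is penalized. A regularizer of this kind would push a trainable segmenter toward the former behavior.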