Leveraging Clinical Text and Class Conditioning for 3D Prostate MRI Generation

From arXiv. Accepted for publication in IEEE Transactions on Biomedical Engineering, 2025. This is the accepted author version. The final published version is available at https://doi.org/10.1109/TBME.2025.3648426

Objective: Latent diffusion models (LDM) could alleviate data scarcity challenges affecting machine learning development for medical imaging. However, medical LDM strategies typically rely on short-prompt text encoders, nonmedical LDMs, or large data volumes. These strategies can limit performance and scientific accessibility. We propose a novel LDM conditioning approach to address these limitations.

Methods: We propose Class-Conditioned Efficient Large Language model Adapter (CCELLA), a novel dual-head conditioning approach that simultaneously conditions the LDM U-Net with free-text clinical reports and radiology classification. We also propose a data-efficient LDM pipeline centered around CCELLA and a proposed joint loss function. We first evaluate our method on 3D prostate MRI against state-of-the-art. We then augment a downstream classifier model training dataset with synthetic images from our method.

Results: Our method achieves a 3D FID score of 0.025 on a size-limited 3D prostate MRI dataset, significantly outperforming a recent foundation model with FID 0.070. When training a classifier for prostate cancer prediction, adding synthetic images generated by our method during training improves classifier accuracy from 69% to 74% and outperforms classifiers trained on images generated by prior state-of-the-art. Classifier training solely on our method's synthetic images achieved comparable performance to real image training.

Conclusion: We show that our method improved both synthetic image quality and downstream classifier performance using limited data and minimal human annotation.

Significance: The proposed CCELLA-centric pipeline enables radiology report and class-conditioned LDM training for high-quality medical image synthesis given limited data volume and human data annotation, improving LDM performance and scientific accessibility.
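For reference, FID-type scores are Fréchet distances between Gaussian fits to feature embeddings of real and synthetic images. The standard definition is sketched below. The symbols $\mu_r, \Sigma_r$ (mean and covariance of real-image features) and $\mu_s, \Sigma_s$ (the same for synthetic images) are notation introduced here for illustration, and the abstract does not state which feature extractor is used for the 3D variant.

$$
\mathrm{FID} = \lVert \mu_r - \mu_s \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_s - 2\,(\Sigma_r \Sigma_s)^{1/2} \right)
$$

Lower values indicate closer feature distributions, which is the sense in which a score of 0.025 improves on 0.070.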
