MeDUET: Disentangled Unified Pretraining for 3D Medical Image Synthesis and Analysis

Self-supervised learning (SSL) and diffusion models have advanced representation learning and image synthesis, but in 3D medical imaging they are still largely used separately for analysis and synthesis, respectively. Unifying them is appealing but difficult, because multi-source data exhibit pronounced style shifts while downstream tasks rely primarily on anatomy, causing anatomical content and acquisition style to become entangled. In this paper, we propose MeDUET, a 3D Medical image Disentangled UnifiEd PreTraining framework in the variational autoencoder latent space. Our central idea is to treat unified pretraining under heterogeneous multi-center data as a factor identifiability problem, where content should consistently capture anatomy and style should consistently capture appearance. MeDUET addresses this problem through three components. Token demixing provides controllable supervision for factor separation, Mixed Factor Token Distillation reduces factor leakage under mixed regions, and Swap-invariance Quadruplet Contrast promotes factor-wise invariance and discriminability. With these learned factors, MeDUET transfers effectively to both synthesis and analysis, yielding higher fidelity, faster convergence, and better controllability for synthesis, while achieving competitive or superior domain generalization and label efficiency on diverse medical benchmarks. Overall, MeDUET shows that multi-source heterogeneity can serve as useful supervision, with disentanglement providing an effective interface for unifying 3D medical image synthesis and analysis. Our code is available at https://github.com/JK-Liu7/MeDUET.
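To make concrete why mixing tokens from different sources can supply "controllable supervision", the sketch below mixes two token sets under a known binary mask. It is an illustrative assumption, not the MeDUET implementation: the abstract does not specify the mixing rule, so the function name `mix_tokens`, the `mix_ratio` parameter, the random position-wise replacement, and the `(num_tokens, dim)` layout of flattened VAE latent tokens are all hypothetical. The only point it shows is that the mask is known by construction, so the source of every token comes as a free label for a demixing objective.

```python
import numpy as np


def mix_tokens(tokens_a, tokens_b, mix_ratio, rng):
    """Replace a random subset of A's tokens with B's tokens at the same positions.

    Hypothetical helper: tokens_a and tokens_b are assumed to be flattened
    latent tokens of shape (num_tokens, dim) from two different volumes.
    Returns the mixed tokens and a boolean mask (True where a token came from B).
    """
    if tokens_a.shape != tokens_b.shape:
        raise ValueError("token sets must have the same shape")
    num_tokens = tokens_a.shape[0]
    num_from_b = int(round(mix_ratio * num_tokens))

    # Choose which positions are taken from B; this mask is the free label.
    from_b = np.zeros(num_tokens, dtype=bool)
    from_b[rng.choice(num_tokens, size=num_from_b, replace=False)] = True

    # Broadcast the per-token mask over the feature dimension.
    mixed = np.where(from_b[:, None], tokens_b, tokens_a)
    return mixed, from_b


rng = np.random.default_rng(0)
tokens_a = rng.normal(size=(64, 8))  # stand-in for latent tokens of volume A
tokens_b = rng.normal(size=(64, 8))  # stand-in for latent tokens of volume B
mixed, from_b = mix_tokens(tokens_a, tokens_b, mix_ratio=0.25, rng=rng)
```

Under these assumptions, a model given `mixed` could be trained to recover `from_b` or the original tokens, and `mix_ratio` controls how hard that task is. How MeDUET actually constructs and supervises mixed tokens is defined in the paper and the linked repository, not here.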
