We propose MMHVAE, a deep mixture of multimodal hierarchical variational auto-encoders that synthesizes missing images from observed images in different modalities. Its design addresses four challenges: (i) creating a complex latent representation of multimodal data to generate high-resolution images; (ii) encouraging the variational distributions to estimate the missing information needed for cross-modal image synthesis; (iii) learning to fuse multimodal information in the context of missing data; (iv) leveraging dataset-level information to handle incomplete datasets at training time. We evaluate MMHVAE extensively on the challenging problem of pre-operative brain multi-parametric magnetic resonance and intra-operative ultrasound imaging.
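To make challenge (iii) concrete, the following is a minimal sketch of one generic way to fuse per-modality posteriors when some modalities are missing: a uniform mixture over the experts that were actually observed. The abstract does not specify MMHVAE's fusion rule, so everything here is an assumption for illustration only, including the names `observed_experts` and `sample_mixture`, the diagonal Gaussian form, the uniform mixture weights, and the modality labels in the usage example.

```python
import random
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: a diagonal Gaussian as (means, stds), one entry per latent dimension.
Gaussian = Tuple[List[float], List[float]]


def observed_experts(posteriors: Dict[str, Optional[Gaussian]]) -> Dict[str, Gaussian]:
    """Keep only the modalities that were observed (missing ones are None)."""
    return {name: p for name, p in posteriors.items() if p is not None}


def sample_mixture(
    posteriors: Dict[str, Optional[Gaussian]], rng: random.Random
) -> Tuple[str, List[float]]:
    """Draw one latent sample from a uniform mixture over the observed experts.

    Assumption: uniform weights; the actual model may weight or combine experts differently.
    """
    experts = observed_experts(posteriors)
    if not experts:
        raise ValueError("at least one modality must be observed")
    # Pick one observed modality uniformly at random, then sample from its Gaussian.
    name = rng.choice(sorted(experts))
    means, stds = experts[name]
    z = [rng.gauss(m, s) for m, s in zip(means, stds)]
    return name, z


if __name__ == "__main__":
    # Illustrative modality labels; "t2" is missing for this subject.
    posteriors = {
        "t1": ([0.0, 0.0], [1.0, 1.0]),
        "t2": None,
        "ius": ([5.0, 5.0], [0.1, 0.1]),
    }
    chosen, z = sample_mixture(posteriors, random.Random(0))
    print(chosen, z)
```

Because missing modalities are simply dropped from the mixture, the same sampling routine works for any subset of observed inputs, which is the property a model trained on incomplete data needs from its fusion step.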