Transforming two-dimensional (2D) images into three-dimensional (3D) volumes is a well-known yet challenging problem in computer vision. In the medical domain, a few previous studies have attempted to convert two or more input radiographs into computed tomography (CT) volumes. Building on these efforts, we introduce a diffusion model-based technology that can rotate the anatomical content of any input radiograph in 3D space, potentially enabling visualization of the entire anatomical content of the radiograph from any viewpoint in 3D. As in previous studies, we used CT volumes to create digitally reconstructed radiographs (DRRs) as training data for our model. However, we addressed two significant limitations of previous studies: (1) we used conditional diffusion models with classifier-free guidance instead of generative adversarial networks (GANs) to achieve higher mode coverage and improved output image quality, the only trade-off being slower inference, which is often less critical in medical applications; and (2) we demonstrated that the unreliable step of using style-transfer deep learning (DL) models, such as Cycle-GAN, to transfer the style of actual radiographs to DRRs can be replaced with a simple yet effective training transformation that randomly changes the pixel intensity histograms of the input and ground-truth imaging data during training. This transformation makes the diffusion model agnostic to variations in the pixel intensity distribution of the input data, enabling a DL model to be trained reliably on input DRRs and the exact same model to be applied to conventional radiographs (or DRRs) during inference.
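To make the histogram transformation concrete, the following is a minimal sketch of one way such a training-time augmentation could look. It is not the implementation used in this work: the function names, the choice of a random intensity window followed by a random gamma curve, the parameter ranges, and the decision to draw the transformation independently for the input and the ground truth are all assumptions made for illustration.

```python
import numpy as np


def random_histogram_transform(image, rng):
    """Apply a random monotonic intensity remapping (hypothetical helper).

    The image is min-max normalized to [0, 1], clipped to a random
    intensity window, and passed through a random gamma curve, so its
    histogram changes while the ordering of pixel intensities is kept.
    """
    img = np.asarray(image, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi > lo:
        img = (img - lo) / (hi - lo)
    else:
        # Constant image: nothing to remap.
        return np.zeros_like(img)

    # Assumed ranges; the abstract does not specify any parameters.
    low = rng.uniform(0.0, 0.2)
    high = rng.uniform(0.8, 1.0)
    img = np.clip((img - low) / (high - low), 0.0, 1.0)

    gamma = rng.uniform(0.5, 2.0)
    return img ** gamma


def augment_pair(input_drr, target_drr, rng):
    """Transform an (input, ground-truth) pair with independent draws.

    Independent draws are an assumption of this sketch, not a detail
    stated in the abstract.
    """
    return (
        random_histogram_transform(input_drr, rng),
        random_histogram_transform(target_drr, rng),
    )
```

In a training loop, `augment_pair` would be called on every sample before it is passed to the diffusion model, so that the model never sees a fixed intensity distribution for either DRRs or radiographs.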