RadRotator: 3D Rotation of Radiographs with Diffusion Models

Source: arXiv
Website: https://pouriarouzrokh.github.io/RadRotator
Online demo: https://huggingface.co/spaces/Pouriarouzrokh/RadRotator
Article information: 16 pages, 11 figures

Transforming two-dimensional (2D) images into three-dimensional (3D) volumes is a well-known yet challenging problem in computer vision. In the medical domain, a few previous studies have attempted to convert two or more input radiographs into computed tomography (CT) volumes. Building on that work, we introduce a diffusion model-based technique that rotates the anatomical content of any input radiograph in 3D space, potentially enabling visualization of the radiograph's entire anatomical content from any viewpoint. As in previous studies, we used CT volumes to create Digitally Reconstructed Radiographs (DRRs) as training data for our model. However, we addressed two significant limitations of previous studies: (1) we used conditional diffusion models with classifier-free guidance instead of Generative Adversarial Networks (GANs) to achieve higher mode coverage and improved output image quality, with the only trade-off being slower inference, which is often less critical in medical applications; and (2) we demonstrated that style transfer deep learning (DL) models such as Cycle-GAN, whose output is unreliable when transferring the style of actual radiographs to DRRs, can be replaced with a simple yet effective training transformation that randomly changes the pixel intensity histograms of the input and ground-truth imaging data during training. This transformation makes the diffusion model agnostic to variations in the pixel intensity distribution of the input data, so a DL model can be trained reliably on input DRRs and the exact same model applied to conventional radiographs (or DRRs) during inference.
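The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the histograms are changed, so a random gamma curve with optional inversion stands in for the transformation, and applying the same remapping to the input and ground-truth images is an assumption. The function names, the gamma range, and the inversion probability are hypothetical. The guidance function uses one common formulation of classifier-free guidance, which may differ from the one used in the paper.

```python
import numpy as np


def random_histogram_transform(image, target, rng):
    """Apply one random monotonic intensity remapping to an image pair.

    Assumption: a random gamma curve plus optional inversion stands in for
    the paper's transformation, and both images share the same remapping.
    """
    gamma = rng.uniform(0.5, 2.0)  # hypothetical range
    invert = rng.random() < 0.5    # hypothetical probability

    def remap(x):
        x = x.astype(np.float64)
        lo, hi = x.min(), x.max()
        x = (x - lo) / (hi - lo + 1e-8)  # rescale intensities to [0, 1]
        x = x ** gamma                   # reshape the histogram
        return 1.0 - x if invert else x

    return remap(image), remap(target)


def classifier_free_guidance(eps_uncond, eps_cond, guidance_scale):
    """Blend unconditional and conditional noise predictions.

    A scale of 1.0 returns the conditional prediction; larger values push
    the estimate further toward the conditioning signal.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    drr_in = rng.random((64, 64)) * 255.0   # stand-in for an input DRR
    drr_out = rng.random((64, 64)) * 255.0  # stand-in for the rotated DRR
    x, y = random_histogram_transform(drr_in, drr_out, rng)
    print(x.shape, y.shape)
```

Because the remapping is monotonic, it changes the intensity histogram without changing which anatomical structures are visible. That is the property the abstract relies on when it applies a model trained on DRRs to conventional radiographs.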

