Computed tomography (CT) scans provide detailed and accurate information about internal structures of the body. They are constructed by sending x-rays through the body from different directions and combining the resulting measurements into a three-dimensional volume. Such volumes can be used to diagnose a wide range of conditions and allow for volumetric measurements of organs. In this work, we tackle the problem of reconstructing CT images from biplanar x-rays only. X-rays are widely available, and even if a CT reconstructed from these radiographs is not a replacement for a complete CT in the diagnostic setting, it might spare patients radiation exposure in cases where a CT is acquired only for rough measurements such as determining organ size. We propose a novel method based on the transformer architecture that frames the underlying task as a language translation problem. Radiographs and CT images are first embedded into latent quantized codebook vectors using two different autoencoder networks. We then train a GPT model to reconstruct the codebook vectors of the CT image, conditioned on the codebook vectors of the x-rays, and show that this approach leads to realistic-looking images. To encourage further research in this direction, we make our code publicly available on GitHub: XXX.
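To make the translation framing concrete, the following is a minimal sketch of how codebook indices from the x-rays could serve as a prefix for autoregressive generation of CT codebook indices. It is not our released implementation: the codebook sizes, the shared-vocabulary offset, the function names, and the `dummy_logits` stand-in for a trained GPT model are all assumptions made purely for illustration.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical codebook sizes, chosen only for illustration.
XRAY_CODEBOOK_SIZE = 8
CT_CODEBOOK_SIZE = 16


def build_training_sequence(xray_codes: Sequence[int],
                            ct_codes: Sequence[int]) -> List[int]:
    """Concatenate x-ray and CT code indices into one token sequence.

    Assumption: CT indices are shifted by XRAY_CODEBOOK_SIZE so that both
    codebooks share a single vocabulary without collisions.
    """
    return list(xray_codes) + [c + XRAY_CODEBOOK_SIZE for c in ct_codes]


def generate_ct_codes(xray_codes: Sequence[int],
                      num_ct_tokens: int,
                      next_token_logits: Callable[[List[int]], List[float]],
                      rng: random.Random) -> List[int]:
    """Sample CT code indices one at a time, conditioned on the x-ray codes."""
    tokens = list(xray_codes)  # the conditioning prefix
    ct_codes: List[int] = []
    for _ in range(num_ct_tokens):
        logits = next_token_logits(tokens)  # one logit per CT codebook entry
        peak = max(logits)
        # Unnormalized softmax weights; random.choices normalizes them.
        weights = [math.exp(value - peak) for value in logits]
        code = rng.choices(range(CT_CODEBOOK_SIZE), weights=weights, k=1)[0]
        ct_codes.append(code)
        tokens.append(code + XRAY_CODEBOOK_SIZE)  # feed the sample back in
    return ct_codes


def dummy_logits(tokens: List[int]) -> List[float]:
    """Hypothetical stand-in for a trained transformer: uniform logits."""
    return [0.0] * CT_CODEBOOK_SIZE


if __name__ == "__main__":
    xray = [1, 5, 2, 7]  # made-up x-ray code indices
    sampled = generate_ct_codes(xray, 6, dummy_logits, random.Random(0))
    print(build_training_sequence(xray, sampled))
```

In a full pipeline, the sampled indices would be looked up in the CT codebook and passed through the CT autoencoder's decoder to obtain a volume; that step is omitted here.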