In medical research, generated synthetic data can replace privacy- and security-sensitive data with a large-scale curated dataset, reducing data collection and annotation costs. As part of this effort, we propose UniXGen, a unified chest X-ray and report generation model, with the following contributions. First, we design a unified model for bidirectional chest X-ray and report generation by adopting a vector quantization method to discretize chest X-rays into visual tokens and formulating both tasks as sequence generation. Second, we introduce several special tokens for generating chest X-rays with specific views, which is useful when the desired views are unavailable. Furthermore, UniXGen flexibly accepts inputs ranging from a single view to multiple views, so it can exploit the additional findings available in other X-ray views. To handle the long input sequences formed by high-resolution multi-view chest X-rays and long paragraph reports, we adopt an efficient transformer for computational and memory efficiency. In extensive experiments, we show that our unified model has a synergistic effect on both generation tasks compared with training task-specific models alone. We also find that the view-specific special tokens distinguish between different views and properly generate specific views even when such views do not exist in the dataset, and that utilizing multi-view chest X-rays faithfully captures the abnormal findings in the additional X-rays. The source code is publicly available at: https://github.com/ttumyche/UniXGen.
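To make the sequence formulation concrete, the following is a minimal sketch of the two ideas named above: mapping image patch vectors to discrete visual tokens by nearest-codebook lookup, and wrapping each view's tokens in view-specific special tokens before appending the report. It is an illustration under stated assumptions, not the released implementation: the token names (`[SOS_PA]`, `[SOS_REPORT]`, and so on), the toy codebook, and the helper functions `quantize` and `build_sequence` are all hypothetical.

```python
# Hypothetical special-token names; the actual UniXGen vocabulary may differ.
VIEW_TOKENS = {
    "PA": ("[SOS_PA]", "[EOS_PA]"),
    "AP": ("[SOS_AP]", "[EOS_AP]"),
    "LATERAL": ("[SOS_LAT]", "[EOS_LAT]"),
}


def quantize(patches, codebook):
    """Return, for each patch vector, the index of its nearest codebook entry."""
    def sq_dist(a, b):
        # Squared Euclidean distance between two equal-length vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(patch, codebook[k]))
        for patch in patches
    ]


def build_sequence(views, report_tokens):
    """Concatenate view-delimited image tokens and report tokens into one list.

    `views` is a list of (view_name, image_token_ids) pairs, so one or
    several views can be supplied.
    """
    seq = []
    for view_name, image_token_ids in views:
        sos, eos = VIEW_TOKENS[view_name]
        seq.append(sos)
        seq.extend(f"img_{i}" for i in image_token_ids)
        seq.append(eos)
    seq.append("[SOS_REPORT]")
    seq.extend(report_tokens)
    seq.append("[EOS_REPORT]")
    return seq


# Toy example: three 2-D "patches" and a three-entry codebook.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
patches = [[0.1, 0.0], [0.9, 1.2], [0.1, 0.8]]
image_ids = quantize(patches, codebook)
sequence = build_sequence([("PA", image_ids)], ["no", "acute", "findings"])
```

In this sketch the report follows the image tokens, which corresponds to generating a report conditioned on an X-ray; placing the report tokens first would correspond to the opposite direction. The ordering shown here is an assumption for illustration only.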