Radiology reports are crucial for planning treatment strategies and enhancing doctor-patient communication, yet manually writing these reports is burdensome for radiologists. While automatic report generation offers a solution, existing methods often rely on single-view radiographs, limiting diagnostic accuracy. To address this problem, we propose MCL, a Multi-view enhanced Contrastive Learning method for chest X-ray report generation. Specifically, we first introduce multi-view enhanced contrastive learning for visual representation by maximizing agreement between multi-view radiographs and their corresponding report. Subsequently, to fully exploit patient-specific indications (e.g., a patient's symptoms) for report generation, we add a transitional ``bridge" for missing indications to reduce embedding-space discrepancies caused by their presence or absence. Additionally, we construct the Multi-view CXR and Two-view CXR datasets from public sources to support research on multi-view report generation. Our proposed MCL surpasses recent state-of-the-art methods across multiple datasets, achieving a 5.0% F1 RadGraph improvement on MIMIC-CXR, a 7.3% BLEU-1 improvement on MIMIC-ABN, a 3.1% BLEU-4 improvement on Multi-view CXR, and an 8.2% F1 CheXbert improvement on Two-view CXR.