Radiology reporting is a crucial part of the communication between radiologists and other medical professionals, but it can be time-consuming and error-prone. One approach to alleviate this is structured reporting, which saves time and enables a more accurate evaluation than free-text reports. However, there is limited research on automating structured reporting, and no public benchmark is available for evaluating and comparing different methods. To close this gap, we introduce Rad-ReStruct, a new benchmark dataset that provides fine-grained, hierarchically ordered annotations in the form of structured reports for X-ray images. We model the structured reporting task as hierarchical visual question answering (VQA) and propose hi-VQA, a novel method that populates a structured radiology report while considering prior context in the form of previously asked questions and their answers. Our experiments show that hi-VQA achieves performance competitive with the state of the art on the medical VQA benchmark VQARad, performs best among methods without domain-specific vision-language pretraining, and provides a strong baseline on Rad-ReStruct. Our work represents a significant step towards the automated population of structured radiology reports and provides a valuable first benchmark for future research in this area. Our dataset and code are available at https://github.com/ChantalMP/Rad-ReStruct.
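To make the idea of hierarchical VQA with prior context concrete, the following is a minimal sketch and not the hi-VQA implementation. It walks a small question hierarchy and hands every question to an answering function together with the history of earlier question-answer pairs. The names `Question`, `fill_report`, and `stub_answer`, the example questions, and the rule that follow-up questions are asked only after a "yes" answer are all assumptions made for illustration; the image input is assumed to be captured inside the answering function.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# History of (question, answer) pairs asked so far (hypothetical representation).
History = List[Tuple[str, str]]


@dataclass
class Question:
    """One node of a hypothetical report template."""
    text: str
    children: List["Question"] = field(default_factory=list)


def fill_report(
    questions: List[Question],
    answer_fn: Callable[[str, History], str],
    history: Optional[History] = None,
) -> History:
    """Traverse the hierarchy depth-first, passing prior context to answer_fn."""
    history = [] if history is None else history
    for q in questions:
        # The answering function sees a copy of everything asked before.
        answer = answer_fn(q.text, list(history))
        history.append((q.text, answer))
        # Assumption: follow-up questions are only relevant after a "yes".
        if answer == "yes":
            fill_report(q.children, answer_fn, history)
    return history


# Hypothetical two-branch template.
template = [
    Question("Is there a lung abnormality?", [
        Question("Is there an opacity?", [
            Question("Where is the opacity located?"),
        ]),
    ]),
    Question("Is there a foreign object?", [
        Question("What kind of object is it?"),
    ]),
]

context_sizes = []


def stub_answer(question: str, history: History) -> str:
    """Stand-in for a VQA model; returns canned answers and records context size."""
    context_sizes.append(len(history))
    canned = {
        "Is there a lung abnormality?": "yes",
        "Is there an opacity?": "yes",
        "Where is the opacity located?": "left lung",
    }
    return canned.get(question, "no")


report = fill_report(template, stub_answer)
for question, answer in report:
    print(f"{question} -> {answer}")
```

In this sketch the follow-up question about the foreign object is never asked, because its parent question was answered "no", and each call receives one more history entry than the previous one.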