Visual question answering-based image-finding generation for pulmonary nodules on chest CT from structured annotations

Interpreting imaging findings based on morphological characteristics is important for diagnosing pulmonary nodules on chest computed tomography (CT) images. In this study, we constructed a visual question answering (VQA) dataset from structured data in an open dataset and investigated a method for generating image findings for chest CT images. The aim was to enable interactive diagnostic support that presents findings in response to questions reflecting physicians' interests, rather than fixed descriptions. We used chest CT images from the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) dataset. Regions of interest surrounding the pulmonary nodules were extracted from these images, and image findings and questions were defined based on the morphological characteristics recorded in the database. A dataset comprising cropped images, corresponding questions, and image findings was constructed, and a VQA model was fine-tuned on it. The generated image findings were evaluated using language evaluation metrics such as BLEU. The constructed VQA dataset contained image findings phrased as natural radiological descriptions. The generated image findings achieved a high CIDEr score of 3.896, and evaluation based on morphological characteristics showed high agreement with the reference findings. In summary, we constructed a VQA dataset for chest CT images using structured information on morphological characteristics from the LIDC-IDRI dataset and investigated a method for generating image findings in response to questions. The generated results and evaluation metric scores indicate that the proposed method is effective as an interactive diagnostic support system that can present image findings according to physicians' interests.
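The dataset construction described above, turning structured morphological ratings into question and finding pairs, can be illustrated with a minimal sketch. It is not the study's implementation: the question wording, the finding sentences, the three-way bucketing of the 1 to 5 ratings, and the names `TEMPLATES`, `VQASample`, `bucket`, and `build_vqa_samples` are all assumptions made for illustration. Only the characteristic names (`margin`, `spiculation`, `texture`) and the 1 to 5 rating scale follow the LIDC-IDRI annotation scheme.

```python
from dataclasses import dataclass

# Hypothetical templates: the questions and finding sentences used in the
# study are not given in the abstract, so these are illustrative only.
TEMPLATES = {
    "margin": (
        "How is the margin of the nodule?",
        {
            "low": "The nodule has a poorly defined margin.",
            "mid": "The nodule has a moderately defined margin.",
            "high": "The nodule has a well-defined margin.",
        },
    ),
    "spiculation": (
        "Does the nodule show spiculation?",
        {
            "low": "No spiculation is observed.",
            "mid": "Mild spiculation is observed.",
            "high": "Marked spiculation is observed.",
        },
    ),
    "texture": (
        "What is the internal texture of the nodule?",
        {
            "low": "The nodule is non-solid (ground-glass).",
            "mid": "The nodule is part-solid.",
            "high": "The nodule is solid.",
        },
    ),
}


@dataclass
class VQASample:
    image_id: str   # identifier of the cropped nodule image
    question: str   # question reflecting one morphological characteristic
    answer: str     # image finding used as the reference text


def bucket(score: int) -> str:
    """Collapse a 1-5 rating into three levels (an assumed simplification)."""
    if score <= 2:
        return "low"
    if score == 3:
        return "mid"
    return "high"


def build_vqa_samples(image_id: str, ratings: dict) -> list:
    """Create one question/finding pair per characteristic that has a template."""
    samples = []
    for name, score in ratings.items():
        if name not in TEMPLATES:
            continue  # characteristics without a template are skipped
        question, phrases = TEMPLATES[name]
        samples.append(VQASample(image_id, question, phrases[bucket(score)]))
    return samples


samples = build_vqa_samples(
    "nodule_0001",  # hypothetical image identifier
    {"margin": 5, "spiculation": 1, "texture": 3, "subtlety": 4},
)
for s in samples:
    print(s.question, "->", s.answer)
```

Each resulting triple (cropped image, question, finding) corresponds to one fine-tuning example for a VQA model; the same reference findings can later be compared with generated text using metrics such as BLEU or CIDEr.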
