To address these issues, we propose a novel Adaptive patch-word Matching (AdaMatch) model that correlates chest X-ray (CXR) image regions with words in medical reports, and we apply it to CXR-report generation to make the generation process explainable. AdaMatch exploits the fine-grained relation between adaptive patches and words to explain specific image regions with their corresponding words. To capture abnormal regions of varying sizes and positions, we introduce an Adaptive Patch extraction (AdaPatch) module that acquires adaptive patches for these regions. To provide explicit explainability for the CXR-report generation task, we propose an AdaMatch-based bidirectional large language model for Cyclic CXR-report generation (AdaMatch-Cyclic). It employs AdaMatch to obtain keywords for CXR images and `keypatches' for medical reports, which serve as hints to guide CXR-report generation. Extensive experiments on two publicly available CXR datasets demonstrate the effectiveness of our method and its superior performance over existing methods.
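The idea of using patch-word matching to pick keywords for an image and keypatches for a report can be illustrated with a minimal sketch. This is not the AdaMatch implementation: the abstract does not specify the similarity function or the selection rule, so the sketch assumes cosine similarity between embeddings and a simple top-k choice by best match. The function names, the embeddings, and the word list are all hypothetical toy values.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def patch_word_similarity(patch_emb, word_emb):
    """Cosine similarity matrix of shape (num_patches, num_words).

    Assumption: cosine similarity stands in for whatever matching
    score the actual model learns.
    """
    return l2_normalize(patch_emb) @ l2_normalize(word_emb).T


def top_keywords(sim, words, k=2):
    """Pick the k words whose best-matching patch scores highest."""
    scores = sim.max(axis=0)          # best patch score for each word
    order = np.argsort(-scores)[:k]   # indices of the k largest scores
    return [words[i] for i in order]


def top_keypatches(sim, k=1):
    """Pick the k patch indices whose best-matching word scores highest."""
    scores = sim.max(axis=1)          # best word score for each patch
    return np.argsort(-scores)[:k].tolist()


# Toy data (hypothetical): 3 patch embeddings and 3 word embeddings in 2-D.
patch_emb = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
word_emb = np.array([[0.9, 0.1], [0.1, 0.9], [-1.0, 0.0]])
words = ["effusion", "cardiomegaly", "the"]

sim = patch_word_similarity(patch_emb, word_emb)
keywords = top_keywords(sim, words, k=2)
keypatches = top_keypatches(sim, k=2)
```

In this toy setup the selected keywords and keypatches would be passed along as hints to a generator; how the actual model consumes them is not described in the abstract and is left out here.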