You've Got Two Teachers: Co-evolutionary Image and Report Distillation for Semi-supervised Anatomical Abnormality Detection in Chest X-ray

Chest X-ray (CXR) anatomical abnormality detection aims to localize and characterize cardiopulmonary radiological findings in radiographs, which can expedite clinical workflow and reduce observational oversights. Most existing methods attempt this task either in fully supervised settings, which demand costly, large-scale per-abnormality annotations, or in weakly supervised settings, which still lag far behind fully supervised methods in performance. In this work, we propose a co-evolutionary image and report distillation (CEIRD) framework, which approaches semi-supervised abnormality detection in CXR by grounding the visual detection results with text-classified abnormalities from paired radiology reports, and vice versa. Concretely, building on the classical teacher-student pseudo label distillation (TSD) paradigm, we introduce an auxiliary report classification model, whose predictions are used for report-guided pseudo detection label refinement (RPDLR) in the primary vision detection task. Conversely, we use the predictions of the vision detection model for abnormality-guided pseudo classification label refinement (APCLR) in the auxiliary report classification task, and propose a co-evolution strategy in which the vision and report models mutually promote each other, with RPDLR and APCLR performed alternately. In this way, we effectively incorporate the weak supervision provided by reports into the semi-supervised TSD pipeline. Besides the cross-modal pseudo label refinement, we further propose an intra-image-modal self-adaptive non-maximum suppression, in which the pseudo detection labels generated by the teacher vision model are dynamically rectified by high-confidence predictions from the student. Experimental results on the public MIMIC-CXR benchmark demonstrate that CEIRD outperforms several up-to-date weakly and semi-supervised methods.

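The report-guided refinement idea (RPDLR) described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual refinement rule, so everything here is an assumption made for illustration: the `Box` type, the `refine_with_report` function, the two thresholds, and the rule itself (keep a teacher pseudo box only if it is confident and the report classifier also flags its class as present). It should not be read as the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical container for one pseudo detection label."""
    label: str      # abnormality class name
    score: float    # teacher confidence in [0, 1]
    xyxy: tuple     # (x1, y1, x2, y2) pixel coordinates


def refine_with_report(teacher_boxes, report_probs, box_thr=0.5, report_thr=0.5):
    """Assumed refinement rule, not taken from the paper.

    Keep a teacher pseudo box only if (a) its own score reaches box_thr and
    (b) the report classifier assigns its class a probability of at least
    report_thr. Classes missing from report_probs are treated as absent.
    """
    present = {cls for cls, p in report_probs.items() if p >= report_thr}
    return [b for b in teacher_boxes if b.score >= box_thr and b.label in present]


# Toy inputs: three teacher pseudo boxes and per-class report probabilities.
teacher_boxes = [
    Box("cardiomegaly", 0.91, (120, 200, 380, 420)),
    Box("pneumothorax", 0.62, (40, 60, 150, 180)),   # class not supported by the report
    Box("cardiomegaly", 0.30, (100, 190, 360, 400)), # too low confidence
]
report_probs = {"cardiomegaly": 0.97, "pneumothorax": 0.04}

refined = refine_with_report(teacher_boxes, report_probs)
```

In this toy case only the first cardiomegaly box survives: the pneumothorax box is dropped because the report does not support that class, and the second cardiomegaly box is dropped for low confidence. The reverse direction (APCLR) could be sketched symmetrically, with detected classes used to adjust the report classifier's pseudo labels, again under assumptions of this kind.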