Pairing-free Group-level Knowledge Distillation for Robust Gastrointestinal Lesion Classification in White-Light Endoscopy

White-Light Imaging (WLI) is the standard modality for endoscopic cancer screening, but Narrow-Band Imaging (NBI) offers richer diagnostic detail. A key challenge is transferring knowledge from NBI to enhance WLI-only models. Existing methods are critically hampered by their reliance on paired NBI-WLI images of the same lesion, a costly and often impractical requirement that leaves vast amounts of clinical data untapped. In this paper, we break this paradigm by introducing PaGKD, a novel Pairing-free Group-level Knowledge Distillation framework that enables effective cross-modal learning from unpaired WLI and NBI data. Instead of forcing alignment between individual, often semantically mismatched image instances, PaGKD operates at the group level to distill more complete and compatible knowledge across modalities. Central to PaGKD are two complementary modules: (1) Group-level Prototype Distillation (GKD-Pro), which distills compact group representations by extracting modality-invariant semantic prototypes via shared lesion-aware queries; and (2) Group-level Dense Distillation (GKD-Den), which performs dense cross-modal alignment by guiding group-aware attention with activation-derived relation maps. Together, these modules enforce global semantic consistency and local structural coherence without requiring image-level correspondence. Extensive experiments on four clinical datasets demonstrate that PaGKD consistently and significantly outperforms state-of-the-art methods, achieving relative AUC improvements of 3.3%, 1.1%, 2.8%, and 3.2%, respectively, establishing a new direction for cross-modal learning from unpaired data.
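To make the group-level idea concrete, the following is a minimal NumPy sketch of how prototype distillation with shared queries might look. It is an illustration under stated assumptions, not the PaGKD implementation: the abstract does not specify the attention form, the loss, or any dimensions, so the softmax attention pooling, the mean-squared-error loss, and all names (`group_prototypes`, `prototype_distillation_loss`, the array shapes) are hypothetical choices made here. The point it shows is that a shared set of queries can summarize a WLI group and an NBI group of different sizes into same-shaped prototypes, which can then be compared without any image-level pairing.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def group_prototypes(queries, group_feats):
    """Pool a group of feature tokens into one prototype per query.

    queries:     (K, D) shared queries (assumed form of "lesion-aware queries")
    group_feats: (N, D) tokens gathered from all images in one group
    returns:     (K, D) prototypes; the shape does not depend on N
    """
    d = queries.shape[1]
    # Each query attends over every token in the group (scaled dot product).
    attn = softmax(queries @ group_feats.T / np.sqrt(d), axis=1)  # (K, N)
    return attn @ group_feats


def prototype_distillation_loss(queries, wli_feats, nbi_feats):
    """Assumed loss: MSE between WLI (student) and NBI (teacher) prototypes."""
    p_wli = group_prototypes(queries, wli_feats)
    p_nbi = group_prototypes(queries, nbi_feats)
    return float(np.mean((p_wli - p_nbi) ** 2))


# Toy data: the two groups are unpaired and deliberately differ in size.
rng = np.random.default_rng(0)
queries = rng.normal(size=(4, 16))     # K=4 queries, D=16 features
wli_group = rng.normal(size=(30, 16))  # e.g. tokens from a WLI group
nbi_group = rng.normal(size=(42, 16))  # e.g. tokens from an NBI group

loss = prototype_distillation_loss(queries, wli_group, nbi_group)
print(f"prototype distillation loss: {loss:.4f}")
```

Because the prototypes are attention-weighted sums over the whole group, they are unaffected by the order of tokens within a group and by the number of images it contains, which is what allows the comparison to proceed without one-to-one WLI-NBI correspondence in this sketch.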
