Source-free cross-modal knowledge transfer is a crucial yet challenging task that aims to transfer knowledge from a source modality (e.g., RGB) to a target modality (e.g., depth or infrared) without access to the task-relevant (TR) source data, which is unavailable due to memory and privacy concerns. A recent attempt leverages paired task-irrelevant (TI) data and directly matches their features to eliminate the modality gap. However, it overlooks a pivotal clue: the paired TI data can be used to estimate the source data distribution and thereby better facilitate knowledge transfer to the target modality. To this end, we propose a novel yet concise framework that unlocks the potential of paired TI data for source-free cross-modal knowledge transfer. Our framework rests on two key technical components. First, to better estimate the source data distribution, we introduce a Task-irrelevant data-Guided Modality Bridging (TGMB) module. It translates target-modality data (e.g., infrared) into source-like RGB images, using the paired TI data and the guidance of the available source model, to alleviate two key gaps: 1) the inter-modality gap between the paired TI data; 2) the intra-modality gap between the TI and TR target data. Second, we propose a Task-irrelevant data-Guided Knowledge Transfer (TGKT) module that transfers knowledge from the source model to the target model by leveraging the paired TI data. Notably, because labels for the TR target data are unavailable and the source model's predictions on these data are less reliable, the TGKT module incorporates a self-supervised pseudo-labeling approach that enables the target model to learn from its own predictions. Extensive experiments show that our method achieves state-of-the-art performance on three datasets covering RGB-to-depth and RGB-to-infrared transfer.
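The abstract does not specify how the self-supervised pseudo-labeling in TGKT is implemented. As a rough illustration of the general idea only, the sketch below uses a common confidence-thresholding variant: the target model's own predictions on unlabeled TR target data are kept as training labels only when the top class probability is high enough. The function names (`softmax`, `pseudo_labels`, `cross_entropy`) and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_labels(batch_logits, threshold=0.9):
    """Select confident predictions as pseudo-labels.

    Assumption (not from the paper): a sample is kept only if its top
    class probability is at least `threshold`.
    Returns a list of (sample_index, predicted_class) pairs.
    """
    selected = []
    for i, logits in enumerate(batch_logits):
        probs = softmax(logits)
        conf = max(probs)
        if conf >= threshold:
            selected.append((i, probs.index(conf)))
    return selected


def cross_entropy(logits, label):
    """Cross-entropy loss of one sample against a (pseudo-)label."""
    return -math.log(softmax(logits)[label])


# Toy target-model outputs for two unlabeled samples and three classes.
batch = [[5.0, 0.0, 0.0],   # confident prediction for class 0
         [0.1, 0.0, 0.2]]   # near-uniform prediction, discarded
kept = pseudo_labels(batch, threshold=0.9)
# Average loss over the retained samples only.
loss = sum(cross_entropy(batch[i], y) for i, y in kept) / max(len(kept), 1)
```

In a real training loop the retained pairs would feed a gradient step on the target model; the threshold trades off label noise against the number of usable samples.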