The goal of Universal Cross-Domain Retrieval (UCDR) is to achieve robust performance in generalized test scenarios, where test data may belong to domains and categories that are strictly unseen during training. Recently, pre-trained models with prompt tuning have shown strong generalization capabilities and achieved notable results in various downstream tasks, such as few-shot learning and video-text retrieval. However, applying them directly to UCDR may not be sufficient to handle both domain shift (i.e., adapting to unfamiliar domains) and semantic shift (i.e., transferring to unknown categories). To this end, we propose \textbf{Pro}mpting-to-\textbf{S}imulate (ProS), the first method to apply prompt tuning to UCDR. ProS employs a two-step process to simulate Content-aware Dynamic Prompts (CaDP), which guide models to produce generalized features for UCDR. Concretely, in the Prompt Units Learning stage, we introduce two Prompt Units that separately capture domain and semantic knowledge in a mask-and-align way. Then, in the Context-aware Simulator Learning stage, we train a Content-aware Prompt Simulator under a simulated test scenario to produce the corresponding CaDP. Extensive experiments conducted on three benchmark datasets show that our method achieves new state-of-the-art performance without introducing excessive parameters. Our code is publicly available at https://github.com/fangkaipeng/ProS.
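As a rough illustration of what training "under a simulated test scenario" could look like, the sketch below hides one domain and a few categories in each training episode, so that the remaining data plays the role of the seen set and the hidden data plays the role of unseen queries. This is only an assumption about the general idea, not the procedure used in ProS: the function `make_episode`, the `(domain, label, item)` tuple layout, and the toy domain and class names are all hypothetical.

```python
import random


def make_episode(samples, n_held_classes, rng):
    """Hide one random domain and n random classes (hypothetical helper).

    samples: list of (domain, label, item) tuples.
    Returns the held-out domain, the held-out classes, the visible
    samples, and the hidden samples that mimic unseen test data.
    """
    domains = sorted({d for d, _, _ in samples})
    classes = sorted({c for _, c, _ in samples})
    held_domain = rng.choice(domains)
    held_classes = set(rng.sample(classes, n_held_classes))

    visible, hidden = [], []
    for sample in samples:
        domain, label, _ = sample
        # A sample is hidden if either its domain or its class is held out.
        if domain == held_domain or label in held_classes:
            hidden.append(sample)
        else:
            visible.append(sample)
    return held_domain, held_classes, visible, hidden


# Toy data: 3 domains x 4 classes x 2 items each (names are placeholders).
samples = [
    (d, c, f"{d}_{c}_{i}")
    for d in ("sketch", "painting", "clipart")
    for c in ("cat", "dog", "car", "tree")
    for i in range(2)
]

held_domain, held_classes, visible, hidden = make_episode(
    samples, n_held_classes=2, rng=random.Random(0)
)
print(held_domain, sorted(held_classes), len(visible), len(hidden))
```

In this toy setup every episode leaves 2 domains and 2 classes visible, so a model trained on `visible` never sees the held-out domain or classes, loosely mirroring the domain shift and semantic shift described above.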