Multi-Target Domain Adaptation (MTDA) entails learning domain-invariant information from a single source domain and applying it to multiple unlabeled target domains. However, existing MTDA methods predominantly address domain shifts in visual features, often overlooking semantic features, and struggle to handle unknown classes, a setting known as Open-Set (OS) MTDA. While large-scale vision-language foundation models like CLIP show promise, their potential for MTDA remains largely unexplored. This paper introduces COSMo, a novel method that learns domain-agnostic prompts through source-domain-guided prompt learning to tackle the MTDA problem in the prompt space. By leveraging a domain-specific bias network and separate prompts for known and unknown classes, COSMo effectively adapts across both domain and class shifts. To the best of our knowledge, COSMo is the first method to address Open-Set Multi-Target DA (OSMTDA), offering a more realistic representation of real-world scenarios and addressing the challenges of both open-set and multi-target DA. COSMo demonstrates an average improvement of $5.1\%$ across three challenging datasets: Mini-DomainNet, Office-31, and Office-Home, compared to other related DA methods adapted to operate within the OSMTDA setting. Code is available at: https://github.com/munish30monga/COSMo
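The core idea of the abstract, shared domain-agnostic prompt tokens (with separate contexts for known and unknown classes) shifted by a domain-specific bias network, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the module name `SourceGuidedPrompt`, the layer sizes, and the feature dimensions are illustrative assumptions, and a stand-in tensor replaces real CLIP image embeddings.

```python
import torch
import torch.nn as nn

class SourceGuidedPrompt(nn.Module):
    """Hypothetical sketch of COSMo-style prompt learning (not the official code).

    Shared, domain-agnostic context tokens are learned once; a small
    domain-specific bias network conditions them on each image feature,
    so prompts adapt per target domain without per-domain prompt sets.
    """

    def __init__(self, n_ctx: int = 4, dim: int = 512):
        super().__init__()
        # Separate learnable contexts for known classes and the unknown class,
        # as the abstract describes.
        self.known_ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        self.unknown_ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        # Domain-specific bias network: maps an image embedding to an
        # additive shift applied to the shared context tokens.
        self.bias_net = nn.Sequential(
            nn.Linear(dim, dim // 4), nn.ReLU(), nn.Linear(dim // 4, dim)
        )

    def forward(self, image_feat: torch.Tensor):
        # image_feat: (B, dim) image embedding (e.g., from CLIP's image encoder).
        bias = self.bias_net(image_feat)                          # (B, dim)
        known = self.known_ctx.unsqueeze(0) + bias.unsqueeze(1)   # (B, n_ctx, dim)
        unknown = self.unknown_ctx.unsqueeze(0) + bias.unsqueeze(1)
        return known, unknown

# Stand-in for a batch of CLIP image features.
feats = torch.randn(8, 512)
known, unknown = SourceGuidedPrompt()(feats)
print(known.shape, unknown.shape)
```

The biased context tokens would then be concatenated with class-name token embeddings and passed through the text encoder, in the style of prompt-learning methods such as CoOp; that step is omitted here for brevity.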