Visual Domain Prompts (VDP) have shown promising potential for addressing visual cross-domain problems. Existing methods apply VDP to domain adaptation (DA) for classification, for example by tuning image-level or feature-level prompts for target domains. Because these dense prompts are opaque and mask out continuous spatial details in the prompt regions, they suffer from inaccurate contextual information extraction and insufficient domain-specific feature transfer when applied to DA for dense prediction (i.e., semantic segmentation). We therefore propose Sparse Visual Domain Prompts (SVDP), a novel approach tailored to domain shift in semantic segmentation that keeps only a minimal set of discrete trainable parameters in the prompt (e.g., 10\%) and preserves more spatial information. To apply SVDP effectively, we propose a Domain Prompt Placement (DPP) method that uses uncertainty guidance to adaptively place several SVDP on regions with a large data distribution distance, which aims to extract more local domain-specific knowledge and enable efficient cross-domain learning. Furthermore, we design a Domain Prompt Updating (DPU) method that optimizes prompt parameters differently for each target domain sample according to its degree of domain shift, which helps SVDP better fit target domain knowledge. Experiments on widely used benchmarks (Cityscapes, Foggy-Cityscapes, and ACDC) show that our method achieves state-of-the-art performance in source-free adaptation for semantic segmentation, covering six Test-Time Adaptation scenarios and one Continual Test-Time Adaptation scenario.
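To make the idea of a sparse prompt placed by uncertainty more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify how the sparse positions are chosen, how uncertainty is computed, or how placement is scored, so everything here is an assumption: the function names (`make_sparse_mask`, `most_uncertain_window`, `apply_sparse_prompt`) are hypothetical, the image is a single-channel grid of floats, sparse positions are drawn at random, and "highest summed uncertainty in a window" stands in for the uncertainty-guided placement.

```python
import random


def make_sparse_mask(height, width, density=0.1, seed=0):
    """Return a height x width 0/1 mask with round(density * height * width) ones.

    Assumption: active positions are sampled uniformly at random.
    """
    rng = random.Random(seed)
    n_active = round(density * height * width)
    active = set(rng.sample(range(height * width), n_active))
    return [[1 if r * width + c in active else 0 for c in range(width)]
            for r in range(height)]


def most_uncertain_window(uncertainty, win):
    """Return the (top, left) corner of the win x win window whose summed
    uncertainty is largest. Assumption: a stand-in for uncertainty-guided placement.
    """
    rows, cols = len(uncertainty), len(uncertainty[0])
    best_total, best_pos = float("-inf"), (0, 0)
    for r in range(rows - win + 1):
        for c in range(cols - win + 1):
            total = sum(uncertainty[r + i][c + j]
                        for i in range(win) for j in range(win))
            if total > best_total:
                best_total, best_pos = total, (r, c)
    return best_pos


def apply_sparse_prompt(image, prompt, mask, top, left):
    """Add prompt values only where mask is 1; every other pixel passes through."""
    out = [row[:] for row in image]
    for i, mask_row in enumerate(mask):
        for j, is_active in enumerate(mask_row):
            if is_active:
                out[top + i][left + j] += prompt[i][j]
    return out


# Toy data: an 8x8 blank image whose bottom-right quadrant is "uncertain".
image = [[0.0] * 8 for _ in range(8)]
uncertainty = [[0.0] * 8 for _ in range(8)]
for r in range(4, 8):
    for c in range(4, 8):
        uncertainty[r][c] = 1.0

mask = make_sparse_mask(4, 4, density=0.1)   # sparse 4x4 prompt footprint
prompt = [[0.5] * 4 for _ in range(4)]       # would be trainable in practice
top, left = most_uncertain_window(uncertainty, win=4)
prompted = apply_sparse_prompt(image, prompt, mask, top, left)
```

In this toy setting the prompt lands on the uncertain quadrant and modifies only the masked positions inside it, so the remaining pixels in the prompt region keep their original values. A dense prompt would instead overwrite or shift every pixel in that region.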