Domain Generalization (DG) endeavors to create machine learning models that excel in unseen scenarios by learning invariant features. In DG, the prevalent practice of constraining models to a fixed structure or uniform parameterization to encapsulate invariant features can inadvertently blend domain-specific characteristics into them. Such an approach struggles to differentiate nuanced inter-domain variations and may be biased toward certain domains, hindering the precise learning of domain-invariant features. Recognizing this, we introduce a novel method designed to supplement the model with domain-level and task-specific characteristics. This approach aims to guide the model in more effectively separating invariant features from specific characteristics, thereby boosting generalization. Building on the emerging trend of visual prompts in the DG paradigm, our work introduces the novel \textbf{H}ierarchical \textbf{C}ontrastive \textbf{V}isual \textbf{P}rompt (HCVP) methodology. This represents a significant advancement in the field, setting itself apart with a unique generative approach to prompts, alongside an explicit model structure and specialized loss functions. Unlike traditional visual prompts, which are often shared across an entire dataset, HCVP utilizes a hierarchical prompt generation network enhanced by prompt contrastive learning. These generated prompts are instance-dependent, catering to the unique characteristics inherent to different domains and tasks. Additionally, we devise a prompt modulation network that serves as a bridge, effectively incorporating the generated visual prompts into the vision transformer backbone. Experiments conducted on five DG datasets demonstrate the effectiveness of HCVP, which outperforms both established DG algorithms and adaptation protocols.
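The two mechanisms named above — an instance-dependent hierarchical prompt generator (domain-level and task-specific prompts) and a modulation network that prepends the prompts to the transformer's token sequence — can be illustrated with a minimal shape-level sketch. This is a hypothetical simplification, not the paper's implementation: the heads are plain linear maps with illustrative dimensions (`D`, `P`), and the "modulation" is reduced to a projection plus concatenation with the patch tokens.

```python
import numpy as np

rng = np.random.default_rng(0)
D, P = 16, 2  # feature dim and per-level prompt length (illustrative values)

# Hypothetical instance-dependent generation: two linear heads map an
# instance's features to a domain-level and a task-specific prompt.
W_dom = rng.standard_normal((D, P * D)) * 0.02
W_task = rng.standard_normal((D, P * D)) * 0.02

def generate_prompts(feats):
    """feats: (B, D) instance features -> (B, 2P, D) hierarchical prompts."""
    dom = (feats @ W_dom).reshape(-1, P, D)    # domain-level prompts
    task = (feats @ W_task).reshape(-1, P, D)  # task-specific prompts
    return np.concatenate([dom, task], axis=1)

# Hypothetical modulation: project the prompts and prepend them to the
# patch-token sequence fed to the vision transformer backbone.
W_proj = rng.standard_normal((D, D)) * 0.02

def modulate(tokens, prompts):
    """tokens: (B, N, D) patch tokens -> (B, N + 2P, D) prompted sequence."""
    return np.concatenate([prompts @ W_proj, tokens], axis=1)

feats = rng.standard_normal((3, D))    # a batch of 3 instance features
tokens = rng.standard_normal((3, 10, D))  # 10 patch tokens per instance
out = modulate(tokens, generate_prompts(feats))
print(out.shape)  # (3, 14, 16): 2*P = 4 prompt tokens prepended to 10 patches
```

Because the prompts are computed from each instance's own features rather than shared dataset-wide, two inputs from different domains receive different prompt tokens, which is the property the abstract attributes to HCVP's generative design.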