We address the challenge of semi-supervised domain generalization (SSDG). Specifically, we aim to learn a model with domain-generalizable features by leveraging a limited set of labelled data alongside a substantially larger pool of unlabeled data. Existing domain generalization (DG) methods, which cannot exploit unlabeled data, perform poorly in the SSDG setting compared to semi-supervised learning (SSL) methods. Nevertheless, SSL methods still leave considerable room for improvement relative to fully-supervised DG training. To tackle this underexplored yet highly practical SSDG problem, we make the following core contributions. First, we propose a feature-based conformity technique that matches the posterior distributions from the feature space with the pseudo-labels from the model's output space. Second, we develop a semantics alignment loss that learns semantically compatible representations by regularizing the semantic structure of the feature space. Our method is plug-and-play and can be readily integrated with different SSL-based SSDG baselines without introducing additional parameters. Extensive experiments across five challenging DG benchmarks with four strong SSL baselines show that our method delivers consistent and notable gains in two different SSDG settings.
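To make the first contribution concrete, here is a minimal, self-contained sketch of one way a feature-based conformity loss could look. It is an illustration under assumptions, not the paper's implementation: we assume class posteriors in the feature space are derived from distances to per-class prototypes (`prototypes` is a hypothetical construct introduced here), and the loss pulls that posterior toward the hard pseudo-label obtained from the model's output space.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def feature_posterior(feature, prototypes):
    # Class posterior from the feature space: softmax over negative
    # squared distances to per-class prototypes. The prototype-based
    # form is an assumption for illustration only.
    dists = [sum((f - p) ** 2 for f, p in zip(feature, proto))
             for proto in prototypes]
    return softmax([-d for d in dists])

def conformity_loss(feature, prototypes, pseudo_label):
    # Cross-entropy between the feature-space posterior and the
    # one-hot pseudo-label taken from the model's output space,
    # encouraging the two spaces to agree on unlabeled samples.
    q = feature_posterior(feature, prototypes)
    return -math.log(max(q[pseudo_label], 1e-12))

# Toy usage: a feature near class-0's prototype conforms to
# pseudo-label 0 (low loss) and conflicts with pseudo-label 1.
protos = [[0.0, 0.0], [4.0, 0.0]]
feat = [0.1, 0.0]
loss_agree = conformity_loss(feat, protos, 0)
loss_conflict = conformity_loss(feat, protos, 1)
```

In this toy setup the loss is small when the feature-space posterior and the pseudo-label agree, and large when they conflict, which is the qualitative behaviour the conformity objective targets.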