We address the problem of semi-supervised domain generalization (SSDG). Specifically, our aim is to obtain a model that learns domain-generalizable features by leveraging a limited subset of labeled data alongside a substantially larger pool of unlabeled data. Existing domain generalization (DG) methods, which are unable to exploit unlabeled data, perform poorly compared to semi-supervised learning (SSL) methods in the SSDG setting. Nevertheless, SSL methods leave considerable room for improvement relative to fully supervised DG training. To tackle this underexplored yet highly practical problem of SSDG, we make the following core contributions. First, we propose a feature-based conformity technique that matches the posterior distribution in the feature space with the pseudo-label from the model's output space. Second, we develop a semantics alignment loss that learns semantically compatible representations by regularizing the semantic structure of the feature space. Our method is plug-and-play and can be readily integrated with different SSL-based SSDG baselines without introducing any additional parameters. Extensive experiments across five challenging DG benchmarks with four strong SSL baselines show that our method provides consistent and notable gains in two different SSDG settings.
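To make the two contributions concrete, the following is a minimal sketch of how such losses could be implemented on top of an SSL baseline. It assumes details the abstract does not specify: per-class prototypes maintained in feature space (e.g., running means of labeled features) and FixMatch-style confidence-masked pseudo-labels. All names (feature_logits, conformity_loss, semantic_alignment_loss) are illustrative, not the authors' actual API, and the alignment term shown is one plausible instantiation rather than the exact loss of the paper.

```python
# Hypothetical sketch of feature-based conformity and a semantics alignment
# regularizer, on top of an SSL baseline (e.g., FixMatch-style pseudo-labels).
import torch
import torch.nn.functional as F


def feature_logits(features, prototypes, temperature=0.1):
    """Feature-space class scores: scaled cosine similarity to class prototypes."""
    f = F.normalize(features, dim=1)           # (N, D) unlabeled features
    p = F.normalize(prototypes, dim=1)         # (C, D) assumed class prototypes
    return f @ p.t() / temperature             # (N, C)


def conformity_loss(features, prototypes, pseudo_labels, mask):
    """Match the feature-space posterior to the output-space pseudo-label,
    only for unlabeled samples whose pseudo-label passed the confidence mask."""
    logits = feature_logits(features, prototypes)
    per_sample = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return (per_sample * mask).sum() / mask.sum().clamp(min=1.0)


def semantic_alignment_loss(features, prototypes, pseudo_labels, mask):
    """One plausible semantic-structure regularizer: pull each confident
    unlabeled feature toward the prototype of its pseudo-class."""
    f = F.normalize(features, dim=1)
    p = F.normalize(prototypes, dim=1)
    target = p[pseudo_labels]                  # (N, D) prototype per sample
    per_sample = 1.0 - (f * target).sum(dim=1)  # cosine distance
    return (per_sample * mask).sum() / mask.sum().clamp(min=1.0)


# Usage sketch: added to the baseline's supervised + unsupervised objectives.
if __name__ == "__main__":
    N, D, C = 8, 128, 7
    feats = torch.randn(N, D)                  # features of unlabeled batch
    protos = torch.randn(C, D)                 # assumed per-class prototypes
    probs = torch.rand(N, C).softmax(dim=1)    # classifier output (weak view)
    conf, pl = probs.max(dim=1)
    mask = (conf > 0.95).float()               # confidence threshold (assumed)
    extra = conformity_loss(feats, protos, pl, mask) \
          + semantic_alignment_loss(feats, protos, pl, mask)
```

Because both terms operate only on features, pseudo-labels, and prototypes, they add no trainable parameters, which is consistent with the plug-and-play claim above.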