Domain Generalization (DG) seeks to transfer knowledge learned from multiple source domains to unseen target domains despite domain shift. Effective generalization typically requires a large and diverse set of labeled source data to learn representations robust to new, unseen domains. However, obtaining such high-quality labeled data is often costly and labor-intensive, limiting the practical applicability of DG. To address this, we investigate a more practical and challenging problem: label-efficient semi-supervised domain generalization (SSDG). In this paper, we propose CAT, a novel method that leverages semi-supervised learning with limited labeled data to achieve competitive generalization performance under domain shift. CAT addresses key limitations of previous approaches, namely their reliance on fixed confidence thresholds and their sensitivity to noisy pseudo-labels: it combines adaptive thresholding with noisy-label refinement, yielding a straightforward yet highly effective solution for SSDG. Specifically, our approach uses flexible thresholds to generate high-quality pseudo-labels with greater class diversity, while refining noisy pseudo-labels to improve their reliability. Extensive experiments across multiple benchmark datasets demonstrate the superior performance of our method, highlighting its effectiveness in achieving robust generalization under domain shift.
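To make the adaptive-thresholding idea concrete, the following is a minimal, hypothetical sketch (not the paper's exact algorithm): per-class confidence thresholds are adapted from an exponential moving average of each class's mean prediction confidence, so under-learned classes face a lower bar and contribute more pseudo-labels, increasing class diversity. All function and parameter names here are illustrative assumptions.

```python
import numpy as np

def adaptive_pseudo_labels(probs, base_tau=0.95, ema=None, momentum=0.9):
    """Illustrative class-adaptive thresholding for pseudo-labeling.

    probs: (N, C) softmax outputs on unlabeled samples.
    Returns pseudo-labels (-1 where rejected) and updated per-class EMA stats.
    NOTE: a sketch of the general technique, not CAT's actual procedure.
    """
    n, c = probs.shape
    conf = probs.max(axis=1)        # per-sample max confidence
    pred = probs.argmax(axis=1)     # hard predictions
    if ema is None:
        ema = np.full(c, 1.0 / c)   # initial per-class learning-status estimate
    # EMA of mean confidence for each predicted class
    for k in range(c):
        mask = pred == k
        if mask.any():
            ema[k] = momentum * ema[k] + (1 - momentum) * conf[mask].mean()
    # Scale the base threshold by each class's normalized learning status:
    # less-confident (under-learned) classes get a lower threshold.
    tau = base_tau * ema / ema.max()
    labels = np.where(conf >= tau[pred], pred, -1)
    return labels, ema
```

A refinement step, as the abstract describes, would then revisit the accepted pseudo-labels (e.g., by agreement across augmentations or neighbors) rather than trusting them as-is.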