MultiMatch: Multi-task Learning for Semi-supervised Domain Generalization

Domain generalization (DG) aims to learn a model on source domains that generalizes well to an unseen target domain. Although DG has achieved great success, most existing methods require label information for all training samples in the source domains, which is time-consuming and expensive to obtain in real-world applications. In this paper, we address the semi-supervised domain generalization (SSDG) task, in which only a small amount of label information is available in each source domain. To address this task, we first analyze the theory of multi-domain learning, which highlights that 1) mitigating the impact of the domain gap and 2) exploiting all samples to train the model can effectively reduce the generalization error in each source domain and thereby improve the quality of pseudo-labels. Based on this analysis, we propose MultiMatch, which extends FixMatch to a multi-task learning framework to produce high-quality pseudo-labels for SSDG. Specifically, we treat each training domain as a single task (i.e., a local task) and combine all training domains together (i.e., the global task) to train an extra task for the unseen test domain. In the multi-task framework, we use independent batch normalization (BN) layers and an independent classifier for each task, which effectively alleviates the interference from different domains during pseudo-labeling. In addition, most of the parameters in the framework are shared and can therefore be trained sufficiently by all training samples. Moreover, to further boost the pseudo-label accuracy and the model's generalization, we fuse the predictions from the global task and the local task during training and testing, respectively. A series of experiments validates the effectiveness of the proposed method, which outperforms existing semi-supervised methods and the SSDG method on several benchmark DG datasets.
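To make the pseudo-labeling step more concrete, the following is a minimal sketch of how predictions from a local task and the global task might be fused before a FixMatch-style confidence threshold is applied. The abstract does not specify the fusion rule or the threshold, so the simple averaging of softmax probabilities, the value 0.95, and the names `softmax` and `fused_pseudo_label` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fused_pseudo_label(local_logits, global_logits, threshold=0.95):
    """Fuse local-task and global-task predictions for one unlabeled sample.

    Assumption: fusion is a plain average of the two softmax outputs.
    Assumption: threshold=0.95 is a hypothetical confidence cutoff.
    Returns (label, confidence); label is None when the fused confidence
    is below the threshold, meaning the sample is skipped for this step.
    """
    p_local = softmax(local_logits)
    p_global = softmax(global_logits)
    fused = [(a + b) / 2.0 for a, b in zip(p_local, p_global)]
    confidence = max(fused)
    label = fused.index(confidence)
    if confidence < threshold:
        return None, confidence
    return label, confidence


if __name__ == "__main__":
    # Both heads agree strongly on class 0, so a pseudo-label is produced.
    print(fused_pseudo_label([10.0, 0.0, 0.0], [8.0, 0.0, 0.0]))
    # The heads are uncertain, so the sample receives no pseudo-label.
    print(fused_pseudo_label([1.0, 0.9, 0.0], [0.5, 0.6, 0.0]))
```

In a full model, the two logit vectors would come from the domain-specific classifier (with its own BN statistics) and the global classifier, both computed on top of the shared backbone.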
