Domain generalization (DG) aims to learn, from multiple known source domains, a model that generalizes well to unknown target domains. Existing DG methods usually pool the shared multi-source data to train a generalizable model. However, large amounts of data are now distributed across many locations and cannot be shared because of privacy policies. In this paper, we tackle the problem of federated domain generalization, where each source dataset can only be accessed and learned from locally to protect privacy. We propose a novel framework called Collaborative Semantic Aggregation and Calibration (CSAC) to address this challenging problem. To fully absorb multi-source semantic information while avoiding unsafe data fusion, we conduct data-free semantic aggregation by fusing the models trained on the separate domains layer by layer. To address the semantic dislocation problem caused by domain shift, we further design cross-layer semantic calibration with an attention mechanism to align each semantic level and enhance domain invariance. We unify multi-source semantic learning and alignment in a collaborative way by alternating semantic aggregation and calibration, which keeps each dataset local and protects data privacy. Extensive experiments demonstrate the effectiveness of our method on this challenging problem.
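To make the data-free aggregation step concrete, the following is a minimal sketch of fusing separately trained models layer by layer. It is not the CSAC implementation: the abstract does not specify the fusion rule, so a weighted average of parameters is assumed here, and the names `Model`, `aggregate_layerwise`, and `domain_models` are hypothetical. The cross-layer attention-based calibration is not shown.

```python
from typing import Dict, List, Optional, Sequence

# Assumption: a "model" is a mapping from layer name to a flat list of weights.
Model = Dict[str, List[float]]


def aggregate_layerwise(models: Sequence[Model],
                        weights: Optional[Sequence[float]] = None) -> Model:
    """Fuse locally trained models layer by layer with a weighted average.

    Only parameters are exchanged, so no training data leaves its source
    domain. Weighted averaging is an assumption made for this sketch.
    """
    if not models:
        raise ValueError("need at least one model")
    if weights is None:
        # Default: every source domain contributes equally.
        weights = [1.0] * len(models)
    if len(weights) != len(models):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    fused: Model = {}
    for layer, reference in models[0].items():
        size = len(reference)
        # All models must share the same architecture, layer by layer.
        for m in models:
            if layer not in m or len(m[layer]) != size:
                raise ValueError(f"layer {layer!r} does not match across models")
        fused[layer] = [
            sum(w * m[layer][i] for w, m in zip(weights, models)) / total
            for i in range(size)
        ]
    return fused


# Hypothetical two-layer models, each trained on a different source domain.
domain_models = [
    {"conv1": [1.0, 2.0], "fc": [0.0]},
    {"conv1": [3.0, 4.0], "fc": [3.0]},
    {"conv1": [5.0, 6.0], "fc": [6.0]},
]
global_model = aggregate_layerwise(domain_models)
print(global_model)
```

In a federated setting, each client would send only its layer parameters to the aggregator, and the fused model would then be returned to the clients for the next round of local learning and calibration.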