Multi-Source Domain Adaptation (MSDA) is a challenging scenario in which multiple related but heterogeneous source datasets must be adapted to an unlabeled target dataset. Conventional MSDA methods often overlook the privacy concerns of data holders, which can prevent direct data sharing. In response, decentralized MSDA has emerged as a promising strategy for achieving adaptation without centralizing clients' data. We propose a novel approach, Decentralized Dataset Dictionary Learning, to address this challenge. Our method uses Wasserstein barycenters to model the distributional shift across multiple clients, enabling effective adaptation while preserving data privacy. Specifically, our algorithm expresses each client's underlying distribution as a Wasserstein barycenter of public atoms, weighted by private barycentric coordinates. These barycentric coordinates remain undisclosed throughout the adaptation process. Extensive experiments on five visual domain adaptation benchmarks show that our strategy outperforms existing decentralized MSDA techniques. Moreover, our method is more robust to client parallelism than conventional decentralized MSDA methods.
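To make the representation concrete, the following is a minimal sketch of how one distribution can be written as a Wasserstein barycenter of shared atoms weighted by barycentric coordinates. It is a toy illustration under strong assumptions, not the algorithm proposed here: the data are one-dimensional, every atom has the same number of equally weighted samples (in which case the 2-Wasserstein barycenter is obtained exactly by averaging sorted samples), and the atoms and coordinates are fixed by hand rather than learned. The names `barycenter_1d`, `public_atoms`, and `private_coords` are hypothetical.

```python
import numpy as np


def barycenter_1d(atoms, weights):
    """Exact 2-Wasserstein barycenter of 1-D empirical distributions.

    atoms:   array of shape (K, n); K atoms, each with n equally weighted samples.
    weights: array of shape (K,); non-negative barycentric coordinates summing to 1.
    Returns the n sorted support points of the barycenter.

    In one dimension, the barycenter's quantile function is the weighted
    average of the atoms' quantile functions, so averaging sorted samples
    is exact under the equal-size, equal-weight assumption.
    """
    atoms = np.sort(np.asarray(atoms, dtype=float), axis=1)
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (atoms.shape[0],):
        raise ValueError("need exactly one weight per atom")
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must lie on the probability simplex")
    return weights @ atoms


# Toy setup (assumed for illustration): two atoms visible to every client.
rng = np.random.default_rng(0)
public_atoms = np.stack([
    rng.normal(-2.0, 1.0, size=500),
    rng.normal(3.0, 1.0, size=500),
])

# Each client keeps its own coordinates local; only the atoms are shared.
private_coords = {
    "client_a": np.array([0.8, 0.2]),
    "client_b": np.array([0.3, 0.7]),
}

client_samples = {
    name: barycenter_1d(public_atoms, w) for name, w in private_coords.items()
}
```

In this sketch, the same public atoms yield a different distribution for each client, and the only client-specific quantity is the weight vector. The sketch does not show how the atoms and coordinates are learned in a decentralized way.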