Multi-source Domain Adaptation (MDA) seeks to adapt models trained on multiple labeled source domains so that they perform effectively on an unlabeled target domain, assuming access to the source data. To address the challenges of model adaptation and data privacy, we introduce Collaborative MDA Through Optimal Transport (CMDA-OT), a novel framework consisting of two key phases. In the first phase, each source domain is independently adapted to the target domain using optimal transport methods. In the second phase, a centralized collaborative learning architecture aggregates the N models trained on the N sources without accessing their data, thereby safeguarding privacy. During this aggregation, the server uses a small set of pseudo-labeled samples from the target domain, known as the target validation subset, to refine and guide the adaptation. This two-phase approach improves model performance on the target domain while keeping the source data private, a requirement that standard MDA does not meet.
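To make the second phase concrete, the following is a minimal sketch of one plausible server-side aggregation rule; it is not taken from the abstract, which does not specify how the N models are combined. It assumes that each source sends only a flat parameter vector, that the server scores each model by its accuracy on the pseudo-labeled target validation subset, and that the normalized scores are used as averaging weights. The linear classifier and the names `predict`, `accuracy`, and `aggregate` are hypothetical and serve illustration only. The first phase (optimal transport adaptation at each source) is not shown.

```python
from typing import List, Sequence, Tuple

Params = List[float]                  # flat parameter vector sent by one source
Sample = Tuple[Sequence[float], int]  # (target features, pseudo-label)


def predict(params: Params, x: Sequence[float]) -> int:
    """Hypothetical binary linear classifier; the last parameter is the bias."""
    score = sum(w * xi for w, xi in zip(params[:-1], x)) + params[-1]
    return 1 if score >= 0.0 else 0


def accuracy(params: Params, val_set: Sequence[Sample]) -> float:
    """Fraction of target validation samples whose pseudo-label is matched."""
    correct = sum(1 for x, y in val_set if predict(params, x) == y)
    return correct / len(val_set)


def aggregate(
    client_params: Sequence[Params], val_set: Sequence[Sample]
) -> Tuple[Params, List[float]]:
    """Assumed rule: average parameters, weighted by validation accuracy.

    Only model parameters reach the server; no source data is accessed.
    """
    scores = [accuracy(p, val_set) for p in client_params]
    total = sum(scores)
    if total == 0.0:
        # Every model failed on the validation subset: fall back to uniform weights.
        weights = [1.0 / len(scores)] * len(scores)
    else:
        weights = [s / total for s in scores]
    dim = len(client_params[0])
    merged = [
        sum(w * p[j] for w, p in zip(weights, client_params)) for j in range(dim)
    ]
    return merged, weights
```

Under this assumed rule, a source whose adapted model agrees poorly with the pseudo-labels contributes little to the aggregated model, which is one way the target validation subset could "guide" the adaptation. Other choices, such as a softmax over validation losses, would fit the description in the abstract equally well.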