Domain adaptation has attracted a great deal of attention in the machine learning community, but it typically requires access to source data, which raises data privacy concerns. To address this, we propose a simple yet efficient method that treats domain adaptation as an unsupervised clustering problem and trains the target model without access to the source data. Specifically, we propose a loss function called contrast and clustering (CaC), in which a positive pair term pulls neighbors belonging to the same class together in the feature space to form clusters, while a negative pair term pushes samples of different classes apart. In addition, extended neighbors are taken into account by querying the nearest neighbor indexes in the memory bank to mine more valuable negative pairs. Extensive experiments on three common benchmarks, VisDA, Office-Home and Office-31, demonstrate that our method achieves state-of-the-art performance. The code will be made publicly available at https://github.com/yukilulu/CaC.
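The abstract does not state the CaC loss in closed form, so the following is only a minimal sketch of how a positive pair term, a negative pair term, and extended-neighbor lookup in a memory bank could fit together. Every name here (`cac_loss_sketch`, `neighbor_indexes`, `feat_bank`, `prob_bank`, `k`, `lam`) is hypothetical, and the specific choices are assumptions rather than the paper's formulation: neighbors are found by cosine similarity, pair scores are dot products of softmax predictions, and extended neighbors (neighbors of neighbors) are simply excluded from the negative set so that likely same-class samples are not pushed apart.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def neighbor_indexes(feat_bank, idx, k):
    """For each bank position in idx, return the k most similar other positions."""
    bank = l2_normalize(feat_bank)
    sim = bank[idx] @ bank.T                     # (len(idx), N)
    sim[np.arange(len(idx)), idx] = -np.inf      # a sample is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def cac_loss_sketch(probs, batch_idx, feat_bank, prob_bank, k=2, lam=1.0):
    """Hypothetical contrast-and-clustering style loss (an assumed form).

    probs:     (B, C) softmax outputs for the current batch
    batch_idx: (B,)   positions of the batch samples in the memory bank
    feat_bank: (N, D) stored features
    prob_bank: (N, C) stored softmax outputs
    """
    batch_idx = np.asarray(batch_idx)
    B, N = len(batch_idx), feat_bank.shape[0]

    # Neighbors, then neighbors of neighbors ("extended neighbors").
    nn = neighbor_indexes(feat_bank, batch_idx, k)                    # (B, k)
    ext = neighbor_indexes(feat_bank, nn.ravel(), k).reshape(B, -1)   # (B, k*k)

    # Positive pair term: reward agreement with the neighbors' stored predictions.
    pos = -np.einsum("bc,bkc->bk", probs, prob_bank[nn]).mean(axis=1)

    # Negative candidates: bank entries that are not self, neighbor,
    # or extended neighbor (assumption: those are likely the same class).
    mask = np.ones((B, N), dtype=bool)
    rows = np.arange(B)[:, None]
    mask[rows, batch_idx[:, None]] = False
    mask[rows, nn] = False
    mask[rows, ext] = False

    # Negative pair term: penalize agreement with the remaining samples.
    sim_all = probs @ prob_bank.T                                     # (B, N)
    neg = (sim_all * mask).sum(axis=1) / np.maximum(mask.sum(axis=1), 1)

    return float((pos + lam * neg).mean())


# Toy memory bank with two well-separated feature clusters.
feat_bank = np.array([[1.0, 0.0], [0.9, 0.1], [0.95, -0.05],
                      [0.0, 1.0], [0.1, 0.9], [-0.05, 0.95]])
clustered = np.array([[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3)  # cluster-consistent predictions
uniform = np.full((6, 2), 0.5)                              # uninformative predictions
idx = np.arange(6)

loss_clustered = cac_loss_sketch(clustered, idx, feat_bank, clustered)
loss_uniform = cac_loss_sketch(uniform, idx, feat_bank, uniform)
print(loss_clustered, loss_uniform)
```

On this toy data the cluster-consistent predictions receive a lower loss than the uniform ones, which is the behavior the abstract describes: neighbors are drawn together while samples outside the neighborhood are pushed apart. A real training setup would compute `probs` with a differentiable framework and refresh the memory bank after each step; both are omitted here.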