Domain Adaptation (DA) is always challenged by the spurious correlation between domain-invariant features (e.g., class identity) and domain-specific features (e.g., environment), a correlation that does not generalize to the target domain. Unfortunately, even when enriched with additional unlabeled target-domain data, existing Unsupervised DA (UDA) methods still suffer from it. This is because the source-domain supervision treats the target-domain samples only as auxiliary data (e.g., by pseudo-labeling), while the inherent distribution of the target domain, where the valuable de-correlation clues hide, is disregarded. We propose to make the U in UDA matter by giving equal status to the two domains. Specifically, we learn an invariant classifier whose predictions are simultaneously consistent with the labels in the source domain and the clusters in the target domain, thereby removing the spurious correlation that is inconsistent in the target domain. We dub our approach "Invariant CONsistency learning" (ICON). Extensive experiments show that ICON achieves state-of-the-art performance on the classic UDA benchmarks Office-Home and VisDA-2017, and outperforms all the conventional methods on the challenging WILDS 2.0 benchmark. Code is available at https://github.com/yue-zhongqi/ICON.
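The idea of a classifier that is consistent with source labels and target clusters at the same time can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the loss, so the pairwise binary cross-entropy form, the helper names (`pairwise_consistency_loss`, `equal_status_objective`), and the unweighted sum of the two terms are all assumptions made for illustration. The sketch also omits whatever mechanism enforces invariance across the two domains. One reason a pairwise form is convenient here is that it depends only on whether two samples share a group, so target cluster ids do not need to be matched to class indices.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def pairwise_consistency_loss(probs, groups, eps=1e-7):
    """Hypothetical consistency loss (an assumption, not taken from the paper).

    probs  : (n, k) predicted class probabilities
    groups : (n,) integer group ids (class labels or cluster assignments)

    For every pair (i, j), the similarity probs[i] . probs[j] is treated as
    the predicted probability that the two samples share a group, and is
    scored with binary cross-entropy against the same-group indicator.
    """
    sim = probs @ probs.T
    same = (groups[:, None] == groups[None, :]).astype(float)
    iu = np.triu_indices(len(groups), k=1)  # each unordered pair once
    s = np.clip(sim[iu], eps, 1.0 - eps)
    t = same[iu]
    return float(-(t * np.log(s) + (1.0 - t) * np.log(1.0 - s)).mean())


def equal_status_objective(src_logits, src_labels, tgt_logits, tgt_clusters):
    """Sum the same loss over both domains, with equal weight (an assumption).

    The source term uses ground-truth labels; the target term uses cluster
    assignments, which are assumed to come from some clustering step that is
    not shown here.
    """
    loss_src = pairwise_consistency_loss(softmax(src_logits), src_labels)
    loss_tgt = pairwise_consistency_loss(softmax(tgt_logits), tgt_clusters)
    return loss_src + loss_tgt
```

In this sketch both domains contribute a term of the same form, which is one simple reading of "giving equal status to the two domains"; a classifier that relies on a feature separating source classes but cutting across target clusters would incur a large target term.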