Unsupervised domain adaptation (UDA) methods transfer models to unlabeled target domains. However, these methods still require a labeled target validation set for hyper-parameter tuning and model selection. In this paper, we aim to find an evaluation metric that can assess the quality of a transferred model without access to target validation labels. We begin with a metric based on the mutual information of the model's predictions. Through empirical analysis, we identify three prevalent issues with this metric: 1) It does not account for the source structure. 2) It can be easily attacked. 3) It fails to detect negative transfer caused by over-alignment of source and target features. To address the first two issues, we incorporate source accuracy into the metric and employ a new MLP classifier that is held out during training, which significantly improves the results. To tackle the final issue, we combine this enhanced metric with data augmentation, resulting in a novel unsupervised metric for UDA called the Augmentation Consistency Metric (ACM). Additionally, we empirically demonstrate the shortcomings of previous experimental settings and conduct large-scale experiments to validate the effectiveness of the proposed metric. Furthermore, we use our metric to automatically search for the optimal hyper-parameter set, achieving better performance than manually tuned sets on four common benchmarks. Code will be available soon.
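As a rough illustration of the starting point described above, the sketch below estimates the mutual information between inputs and predicted labels from softmax outputs on unlabeled target data, using the standard decomposition I(X; Ŷ) = H(E[p]) − E[H(p)]. It also includes a simple agreement rate between predictions on two views of the same samples, which only hints at the idea of augmentation consistency. This is a minimal sketch under stated assumptions and not the ACM defined in the paper: the function names `entropy`, `mutual_information`, and `agreement_rate` are hypothetical, and the source-accuracy term and held-out MLP classifier are not modeled.

```python
import math
from typing import List, Sequence


def entropy(p: Sequence[float], eps: float = 1e-12) -> float:
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(q * math.log(q + eps) for q in p)


def mutual_information(probs: List[Sequence[float]]) -> float:
    """Estimate I(X; Y_hat) as H(mean prediction) minus the mean per-sample entropy.

    `probs` holds one softmax vector per unlabeled target sample.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution: average prediction over all samples.
    marginal = [sum(p[k] for p in probs) / n for k in range(num_classes)]
    # Average uncertainty of the individual predictions.
    conditional = sum(entropy(p) for p in probs) / n
    return entropy(marginal) - conditional


def agreement_rate(probs_a: List[Sequence[float]],
                   probs_b: List[Sequence[float]]) -> float:
    """Fraction of samples whose argmax prediction is the same in both views.

    Hypothetical stand-in for a consistency check between original and
    augmented inputs; the paper's exact formulation is not given here.
    """
    def argmax(p: Sequence[float]) -> int:
        return max(range(len(p)), key=p.__getitem__)

    same = sum(argmax(a) == argmax(b) for a, b in zip(probs_a, probs_b))
    return same / len(probs_a)


if __name__ == "__main__":
    confident_balanced = [[1.0, 0.0], [0.0, 1.0]]
    collapsed = [[1.0, 0.0], [1.0, 0.0]]
    print(mutual_information(confident_balanced))  # close to ln(2)
    print(mutual_information(collapsed))           # close to 0
```

The score is high when predictions are individually confident and collectively spread across classes, and it drops to about zero when all predictions collapse onto one class or are uniformly uncertain. Because it depends only on the target predictions, it is blind to the source structure, which is the first of the three issues listed above.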