Reasoning about a model's accuracy on a test sample from its confidence is a central problem in machine learning, connected to important applications such as uncertainty representation, model selection, and exploration. While these connections have been well studied in the i.i.d. setting, distribution shifts pose significant challenges to traditional methods. As a result, model calibration and model selection remain challenging in unsupervised domain adaptation, a scenario where the goal is to perform well in a distribution-shifted domain without labels. In this work, we tackle the difficulties arising from distribution shifts by developing a novel importance-weighted group accuracy estimator. Specifically, we formulate an optimization problem for finding an importance weight that leads to accurate group accuracy estimation in the distribution-shifted domain, and we support it with theoretical analyses. Extensive experiments show the effectiveness of group accuracy estimation for model calibration and model selection. Our results emphasize the significance of group accuracy estimation for addressing challenges in unsupervised domain adaptation, as an improvement direction orthogonal to improving the transferability of accuracy.
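The abstract does not spell out the estimator or the optimization problem used to find the importance weight. Purely as an illustration of the general idea, a generic self-normalized importance-weighted estimate of per-group accuracy might be sketched as below. Several things here are assumptions rather than the authors' method: groups are taken to be equal-width confidence bins, the importance weights (roughly the ratio of target to source density for each labeled source sample) are assumed to be given already, and the function name `iw_group_accuracy` is hypothetical.

```python
import numpy as np


def iw_group_accuracy(confidence, correct, weights, n_bins=10):
    """Self-normalized importance-weighted accuracy per confidence group.

    confidence : model confidence in [0, 1] for each labeled source sample
    correct    : 1 if the prediction on that sample was right, else 0
    weights    : importance weight per sample (assumed to be precomputed)
    Returns an array of length n_bins; groups with no weight mass are NaN.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Assumption: groups are equal-width confidence bins on [0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Using only the interior edges maps every sample to an index in 0..n_bins-1.
    group = np.digitize(confidence, edges[1:-1])

    # Weighted count of correct predictions and total weight in each group.
    weighted_hits = np.bincount(group, weights=weights * correct, minlength=n_bins)
    total_weight = np.bincount(group, weights=weights, minlength=n_bins)

    accuracy = np.full(n_bins, np.nan)
    nonempty = total_weight > 0
    accuracy[nonempty] = weighted_hits[nonempty] / total_weight[nonempty]
    return accuracy


# Toy usage with two groups: low-confidence and high-confidence samples.
estimates = iw_group_accuracy(
    confidence=[0.1, 0.2, 0.8, 0.9],
    correct=[0, 1, 1, 1],
    weights=[1.0, 3.0, 1.0, 1.0],
    n_bins=2,
)
```

With uniform weights this reduces to the ordinary per-bin accuracy on the source domain; the weights are what shift the estimate toward the target domain. How well it works depends entirely on the quality of the weights, which is the part the paper addresses through its optimization problem.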