Many applications use sensors in mobile devices together with machine learning to provide novel services. However, factors such as differences among users, devices, and environments degrade the performance of these applications, making domain shift (i.e., the distributional shift between the training domain and the target domain) a critical issue in mobile sensing. Domain adaptation methods attempt to solve this problem, but their performance is unreliable because of the complex interplay among these factors. In principle, this performance uncertainty can be identified and remedied by validating performance against ground-truth labels. However, it is infeasible for every user to collect sufficient high-quality labeled data. To address this issue, we present DAPPER (Domain AdaPtation Performance EstimatoR), which estimates the adaptation performance in a target domain using only unlabeled target data. Our key idea is to approximate the model performance based on the mutual information between the model inputs and the corresponding outputs. Our evaluation on four real-world sensing datasets against six baselines shows that, on average, DAPPER outperforms the state-of-the-art baseline by 39.8% in estimation accuracy. Moreover, our on-device experiment shows that DAPPER incurs up to 396× lower computation overhead than the baselines.
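The abstract does not say how DAPPER computes the mutual information between inputs and outputs, so the following is only a minimal sketch of one common way to estimate that quantity from a classifier's softmax outputs on unlabeled data: the entropy of the average prediction minus the average entropy of each prediction. The function name `mutual_information` and the `eps` parameter are hypothetical and not taken from the paper, and the sketch should not be read as the authors' estimator.

```python
import numpy as np


def mutual_information(probs, eps=1e-12):
    """Estimate I(X; Y_hat) from softmax outputs on unlabeled samples.

    Illustrative assumption, not DAPPER's actual method.
    probs: array of shape (num_samples, num_classes), rows sum to 1.
    Returns H(mean prediction) - mean(H(each prediction)), in nats.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)

    # Entropy of the marginal (average) predicted class distribution.
    marginal = probs.mean(axis=0)
    marginal_entropy = -np.sum(marginal * np.log(marginal))

    # Average entropy of the per-sample predictions.
    conditional_entropy = -np.mean(np.sum(probs * np.log(probs), axis=1))

    return marginal_entropy - conditional_entropy


# Confident predictions spread evenly over two classes: high estimate.
confident = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
# Maximally uncertain predictions: the estimate collapses to zero.
uncertain = np.full((4, 2), 0.5)

mi_confident = mutual_information(confident)
mi_uncertain = mutual_information(uncertain)
print(mi_confident, mi_uncertain)
```

Under this estimator, a model that makes confident and diverse predictions on the target data scores close to the logarithm of the number of classes, while a model whose outputs are uninformative scores close to zero. Note that the estimate needs only forward-pass outputs and no labels.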