A crucial assumption underlying most current machine learning theory is that the training distribution is identical to the test distribution. However, this assumption may not hold in some real-world applications. In this paper, we propose an importance-sampling-based data-variation-robust loss (ISloss) for learning problems, which minimizes the worst-case loss subject to a constraint on distribution deviation. Using importance sampling, the distribution deviation constraint can be converted into a constraint over a set of weight distributions centered on the uniform distribution. Furthermore, we reveal a relationship between ISloss under the logarithmic transformation (LogISloss) and the p-norm loss. We apply the proposed LogISloss to the face verification problem on the Racial Faces in the Wild dataset and show that the proposed method is robust under large distribution deviations.
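To make the idea of a worst-case loss over weight distributions centered on the uniform distribution more concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes a KL-divergence penalty (rather than a hard constraint) around the uniform weights, for which the worst-case reweighted loss has a well-known closed form, (1/lam) * log(mean(exp(lam * loss))). The function names, the parameter `lam`, and the toy loss values are all hypothetical. The last lines illustrate one way a logarithmic transformation can connect such an objective to a p-norm: applying the closed form to log-losses gives the logarithm of a normalized p-norm with p = lam. Whether this matches the LogISloss derivation is an assumption.

```python
import numpy as np

def worst_case_weights(losses, lam):
    """Weights maximizing sum(w * l) - (1/lam) * KL(w || uniform) on the simplex.

    Assumption: KL-penalized formulation, chosen for illustration only.
    """
    z = lam * (losses - losses.max())  # shift by the max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def kl_robust_loss(losses, lam):
    """Closed form of the penalized worst case: (1/lam) * log(mean(exp(lam * l)))."""
    m = losses.max()
    return m + np.log(np.mean(np.exp(lam * (losses - m)))) / lam

# Hypothetical per-sample losses; the last sample is an outlier.
losses = np.array([0.1, 0.2, 0.3, 2.0])
lam = 2.0

w = worst_case_weights(losses, lam)
kl = np.sum(w * np.log(w * len(losses)))      # KL(w || uniform)
penalized = np.sum(w * losses) - kl / lam      # objective at the optimal weights
robust = kl_robust_loss(losses, lam)           # should agree with `penalized`

# Link to the p-norm: feed log-losses into the same closed form.
errors = np.array([0.5, 1.0, 2.0, 4.0])        # hypothetical positive losses
p = 3.0
log_form = np.exp(kl_robust_loss(np.log(errors), p))
p_norm = np.mean(errors ** p) ** (1.0 / p)     # normalized p-norm
```

In this sketch, a larger `lam` concentrates the weights on the hardest samples and moves the objective from the mean loss toward the maximum loss, while `lam` close to zero recovers the ordinary average.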