Despite the success of deep-learning models in many tasks, there are concerns that such models learn shortcuts and lack robustness to irrelevant confounders. For models trained directly on human faces, a particularly sensitive confounder is human identity. Many face-related tasks should ideally be identity-independent and perform uniformly across individuals (i.e., be fair). One way to achieve such robustness and performance uniformity is to enforce it during training, which assumes that identity-related information is available at scale. However, because of privacy concerns and the cost of collecting such information, this is often not the case, and most face datasets contain only input images and their corresponding task-related labels. Improving identity-related robustness without such annotations is therefore of great importance. Here, we explore using face-recognition embedding vectors as proxies for identities to enforce such robustness. We propose to use the structure of the face-recognition embedding space to implicitly emphasize rare samples within each class. We do so by weighting samples according to their conditional inverse density (CID) in the proxy embedding space. Our experiments suggest that this simple sample-weighting scheme not only improves training robustness but often also improves overall performance as a result. We also show that employing such constraints during training yields models that are significantly less sensitive to different levels of bias in the dataset.
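To make the idea of conditional inverse density (CID) weighting concrete, the following is a minimal sketch and not the authors' implementation. It assumes that density is estimated with a Gaussian kernel over L2-normalized proxy embeddings, computed separately within each class, and that weights are rescaled to have mean 1 per class. The function name `cid_weights`, the `bandwidth` parameter, and the choice of kernel are all illustrative assumptions; the abstract does not specify the density estimator.

```python
import numpy as np


def cid_weights(embeddings, labels, bandwidth=0.5):
    """Hypothetical sketch of per-class inverse-density sample weights.

    embeddings: (n, d) array of proxy (e.g. face-recognition) embeddings.
    labels:     (n,) array of task labels; density is conditioned on these.
    bandwidth:  Gaussian kernel width (an assumed hyperparameter).
    """
    embeddings = np.asarray(embeddings, dtype=np.float64)
    labels = np.asarray(labels)

    # L2-normalize so distances reflect direction in the embedding space.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    z = embeddings / np.clip(norms, 1e-12, None)

    weights = np.empty(len(z), dtype=np.float64)
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        zc = z[idx]

        # Pairwise squared Euclidean distances within the class.
        sq_dist = np.sum((zc[:, None, :] - zc[None, :, :]) ** 2, axis=-1)

        # Kernel density estimate at each sample (self term keeps it > 0).
        density = np.exp(-sq_dist / (2.0 * bandwidth ** 2)).mean(axis=1)

        # Inverse density, rescaled so the mean weight in the class is 1.
        inv = 1.0 / density
        weights[idx] = inv * len(idx) / inv.sum()
    return weights
```

Under these assumptions, the weights would multiply the per-sample training loss, for example `np.sum(w * per_sample_loss) / np.sum(w)`, so that samples lying in sparse regions of the proxy embedding space within their class contribute more to the objective. The pairwise distance matrix is quadratic in class size, so a nearest-neighbor approximation would likely be needed at scale.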