A fundamental tenet of pattern recognition is that overlap between training and testing sets causes an optimistic accuracy estimate. Deep CNNs for face recognition are trained for N-way classification of the identities in the training set. Accuracy is commonly estimated as the average 10-fold classification accuracy on image pairs from test sets such as LFW, CALFW, CPLFW, CFP-FP and AgeDB-30. Because training and test sets are assembled independently, images and identities in any given test set may also be present in any given training set. In particular, our experiments reveal a surprising degree of identity and image overlap between the LFW family of test sets and the MS1MV2 training set. Our experiments also reveal identity label noise in MS1MV2. To measure the size of the optimistic bias, we compare the accuracy achieved with same-size MS1MV2 subsets, one identity-disjoint from LFW and one not. Repeating the comparison on the more challenging test sets in the LFW family, we find that the optimistic bias grows as the test set becomes more challenging. Our results highlight both the lack of, and the need for, identity-disjoint train and test methodology in face recognition research.
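The comparison between identity-disjoint and non-disjoint training subsets can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the use of plain identity labels to detect overlap, and the choice to match subsets by number of identities (rather than number of images) are all assumptions made for illustration. In practice, overlap detection would also need face matching, since label noise and differing name conventions mean label comparison alone can miss shared identities.

```python
import random
from collections import defaultdict


def split_by_overlap(train_labels, test_identities):
    """Group training image indices by identity, then separate identities
    that also occur in the test set from those that do not.
    Assumption: identities are comparable by label equality."""
    by_identity = defaultdict(list)
    for idx, identity in enumerate(train_labels):
        by_identity[identity].append(idx)
    test_ids = set(test_identities)
    disjoint = {k: v for k, v in by_identity.items() if k not in test_ids}
    overlapping = {k: v for k, v in by_identity.items() if k in test_ids}
    return disjoint, overlapping


def matched_subsets(train_labels, test_identities, n_identities, seed=0):
    """Return two lists of training image indices, each covering
    n_identities identities: one identity-disjoint from the test set,
    one that deliberately includes overlapping identities.
    Hypothetical helper: matches identity counts only, not image counts."""
    rng = random.Random(seed)
    disjoint, overlapping = split_by_overlap(train_labels, test_identities)
    if n_identities > len(disjoint):
        raise ValueError("not enough identity-disjoint identities")

    # Subset 1: identities that never appear in the test set.
    disjoint_ids = rng.sample(sorted(disjoint), n_identities)

    # Subset 2: overlapping identities first, padded with disjoint ones.
    overlap_ids = sorted(overlapping)[:n_identities]
    filler = rng.sample(sorted(disjoint), n_identities - len(overlap_ids))
    mixed_ids = overlap_ids + filler

    by_identity = {**disjoint, **overlapping}

    def to_indices(ids):
        return sorted(i for k in ids for i in by_identity[k])

    return to_indices(disjoint_ids), to_indices(mixed_ids)
```

Training the same network on each subset and evaluating both on the same test pairs would then give a rough estimate of the optimistic bias as the accuracy difference between the two runs.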