One-Shot Learning for Periocular Recognition: Exploring the Effect of Domain Adaptation and Data Bias on Deep Representations

One weakness of machine-learning algorithms is the need to retrain models for each new task. This poses a particular challenge for biometric recognition because databases change over time and, in some cases, data collection depends on subject cooperation. In this paper, we investigate the behavior of deep representations in widely used CNN models under extreme data scarcity for One-Shot periocular recognition, a biometric recognition task. We analyze the outputs of CNN layers as identity-representing feature vectors. We examine the impact of Domain Adaptation on the network layers' output for unseen data, and we evaluate the method's robustness with respect to data normalization and the generalization of the best-performing layer. Using out-of-the-box CNNs trained for the ImageNet Recognition Challenge together with standard computer vision algorithms, we improve on state-of-the-art results that were obtained with networks trained on biometric datasets of millions of images and fine-tuned for the target periocular dataset. For example, on the Cross-Eyed dataset, we reduce the EER by 67% and 79% (from 1.70% and 3.41% to 0.56% and 0.71%) in the Close-World and Open-World protocols, respectively, for the periocular case. We also demonstrate that traditional algorithms like SIFT can outperform CNNs when data is limited or when the network has not been trained with the test classes, as in the Open-World mode. SIFT alone reduces the EER by 64% and 71.6% (from 1.7% and 3.41% to 0.6% and 0.97%) on Cross-Eyed in the Close-World and Open-World protocols, respectively, and by 4.6% (from 3.94% to 3.76%) on the PolyU database for the Open-World, single-biometric case.
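
The percentage reductions quoted above are relative, not absolute, changes in EER. The following minimal sketch recomputes them from the baseline and reported EER values given in the abstract. The function name `relative_reduction` and the row labels are illustrative assumptions, not names from the paper.

```python
def relative_reduction(baseline_eer: float, new_eer: float) -> float:
    """Relative EER reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# (label, baseline EER in %, reported EER in %), values as quoted in the abstract.
# The labels are illustrative shorthand, not identifiers from the paper.
reported = [
    ("Proposed, Cross-Eyed, Close-World", 1.70, 0.56),
    ("Proposed, Cross-Eyed, Open-World", 3.41, 0.71),
    ("SIFT, Cross-Eyed, Close-World", 1.70, 0.60),
    ("SIFT, Cross-Eyed, Open-World", 3.41, 0.97),
    ("SIFT, PolyU, Open-World", 3.94, 3.76),
]

for label, baseline, new in reported:
    reduction = relative_reduction(baseline, new)
    print(f"{label}: {baseline:.2f}% -> {new:.2f}% ({reduction:.1f}% relative reduction)")
```

Computed this way, the SIFT Close-World figure comes out slightly above the quoted 64%, which suggests that value was truncated rather than rounded. The other quoted figures agree with the recomputed ones to the stated precision.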
