Although deep learning is commonly employed for image recognition, it usually requires a large amount of labeled training data, which may not always be readily available. This leads to a noticeable performance disparity when compared to state-of-the-art unsupervised face verification techniques. In this work, we propose a method to narrow this gap by using an autoencoder to convert the face image vector into a new representation. Notably, the autoencoder is trained to reconstruct neighboring face image vectors rather than the original input image vectors. These neighbor face image vectors are selected in an unsupervised manner, as those with the highest cosine scores against the training face image vectors. The proposed method achieves a relative improvement of 56% in terms of equal error rate (EER) over the baseline system on the Labeled Faces in the Wild (LFW) dataset, narrowing the performance gap between cosine and PLDA scoring systems.
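The neighbor selection step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the number of neighbors `k`, the exclusion of each vector from its own neighbor list, and the toy vectors are all assumptions made for illustration. Each training vector is paired with the vectors that score highest against it under cosine similarity, and those neighbors become the reconstruction targets.

```python
import numpy as np


def top_k_cosine_neighbors(vectors: np.ndarray, k: int) -> np.ndarray:
    """For each row, return the indices of its k most cosine-similar other rows."""
    # L2-normalise the rows so that a dot product equals the cosine score.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)
    scores = unit @ unit.T
    # Assumption: a vector is not allowed to be its own neighbor.
    np.fill_diagonal(scores, -np.inf)
    # Sort each row by descending score and keep the first k indices.
    return np.argsort(-scores, axis=1)[:, :k]


def build_training_pairs(vectors: np.ndarray, k: int):
    """Build (input, target) pairs where each target is a neighbor of the input."""
    neighbors = top_k_cosine_neighbors(vectors, k)
    # Repeat every input k times so it lines up with its k neighbor targets.
    inputs = np.repeat(vectors, k, axis=0)
    targets = vectors[neighbors.reshape(-1)]
    return inputs, targets


# Toy stand-ins for face image vectors (hypothetical values, two loose clusters).
toy_vectors = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
toy_neighbors = top_k_cosine_neighbors(toy_vectors, k=1)
toy_inputs, toy_targets = build_training_pairs(toy_vectors, k=1)
```

The resulting `(inputs, targets)` pairs could then be fed to any autoencoder trained with a reconstruction loss, with the target set to the neighbor rather than to the input itself. No labels are used at any point, which is what keeps the selection unsupervised.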