Although deep learning is commonly employed for image recognition, it usually requires a large amount of labeled training data, which may not always be readily available. This leads to a noticeable performance disparity when compared to state-of-the-art unsupervised face verification techniques. In this work, we propose a method to narrow this gap by using an autoencoder to convert the face image vector into a new representation. Notably, the autoencoder is trained to reconstruct neighboring face image vectors rather than the original input image vectors. These neighbor face image vectors are chosen through an unsupervised process, based on the highest cosine scores with the training face image vectors. The proposed method achieves a relative improvement of 56\% in terms of EER over the baseline system on the Labeled Faces in the Wild (LFW) dataset, narrowing the performance gap between cosine and PLDA scoring systems.
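The unsupervised neighbor selection described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how many neighbors are kept per vector or whether a vector may be its own neighbor, so the parameter `k`, the self-exclusion step, and the helper names `top_k_cosine_neighbors` and `build_training_pairs` are all hypothetical.

```python
import numpy as np


def top_k_cosine_neighbors(vectors, k):
    """Return an (n, k) index array of each row's k highest-cosine neighbors.

    Assumption: a vector is never selected as its own neighbor.
    """
    vectors = np.asarray(vectors, dtype=float)
    # L2-normalize rows so that a dot product equals the cosine score.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)
    scores = unit @ unit.T
    # Exclude self-matches by giving them the lowest possible score.
    np.fill_diagonal(scores, -np.inf)
    # Sort by descending score and keep the first k columns.
    return np.argsort(-scores, axis=1)[:, :k]


def build_training_pairs(vectors, k):
    """Pair every input vector with each of its k neighbors as the target."""
    vectors = np.asarray(vectors, dtype=float)
    neighbor_idx = top_k_cosine_neighbors(vectors, k)
    inputs = np.repeat(vectors, k, axis=0)       # each row repeated k times
    targets = vectors[neighbor_idx.reshape(-1)]  # matching neighbor rows
    return inputs, targets


# Toy data: two tight clusters of 2-D "face image vectors".
toy_vectors = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
toy_inputs, toy_targets = build_training_pairs(toy_vectors, k=1)
```

The resulting `(inputs, targets)` pairs would then be fed to an autoencoder whose reconstruction loss is computed against `targets` rather than `inputs`, which is the departure from a conventional autoencoder that the abstract describes.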