The quality of a face crop in an image depends on many factors, such as camera resolution, distance, and illumination. This makes discriminating between face images of different qualities a challenging problem in realistic applications. However, most existing approaches are designed specifically for either high-quality (HQ) or low-quality (LQ) images, and their performance degrades on mixed-quality images. Moreover, many methods require pre-trained feature extractors or other auxiliary structures to support training and evaluation. In this paper, we point out that the key to better understanding HQ and LQ images simultaneously is to apply different learning methods according to image quality. We propose a novel quality-guided joint training approach for mixed-quality face recognition that learns from images of different qualities simultaneously with a single encoder. Based on a quality partition, a classification-based method is employed to learn from HQ data. Meanwhile, LQ images, which lack identity information, are learned with self-supervised image-image contrastive learning. To keep pace with model updates and improve the discriminability of contrastive learning in our joint training scenario, we further propose a proxy-updated real-time queue that composes contrastive pairs with features from the genuine encoder. Experiments on the low-quality datasets SCface and Tinyface, the mixed-quality dataset IJB-B, and five high-quality datasets demonstrate the effectiveness of the proposed approach in recognizing face images of different qualities.
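The joint objective described above can be illustrated with a minimal NumPy sketch. It is written under stated assumptions and is not the authors' implementation: the quality threshold, the cosine-softmax classifier, the InfoNCE form of the contrastive loss, and every function and parameter name are hypothetical. In particular, the plain FIFO queue only stands in for the proxy-updated real-time queue, whose update rule the abstract does not specify.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def joint_loss(feats, feats_aug, labels, quality, class_weights, queue,
               threshold=0.5, scale=16.0, temperature=0.1):
    """Hypothetical quality-guided loss: classification on HQ, contrastive on LQ."""
    feats, feats_aug = l2_normalize(feats), l2_normalize(feats_aug)
    hq = quality >= threshold  # quality partition (the threshold is an assumption)
    lq = ~hq
    loss_cls, loss_con = 0.0, 0.0
    if hq.any():
        # HQ branch: cosine-softmax classification against identity labels.
        logits = scale * feats[hq] @ l2_normalize(class_weights).T
        loss_cls = -log_softmax(logits)[np.arange(hq.sum()), labels[hq]].mean()
    if lq.any():
        # LQ branch: InfoNCE. Positive = augmented view of the same image,
        # negatives = features stored in the queue. Labels are not used.
        pos = (feats[lq] * feats_aug[lq]).sum(axis=1, keepdims=True)
        neg = feats[lq] @ queue.T
        logits = np.concatenate([pos, neg], axis=1) / temperature
        loss_con = -log_softmax(logits)[:, 0].mean()
    return loss_cls, loss_con

def update_queue(queue, new_feats):
    """Plain FIFO stand-in: drop the oldest entries, append the newest features."""
    new_feats = l2_normalize(new_feats)
    return np.concatenate([queue, new_feats], axis=0)[len(new_feats):]

# Toy batch: random vectors play the role of features from the single encoder.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
feats_aug = feats + 0.05 * rng.normal(size=(8, 16))  # second view of each image
labels = rng.integers(0, 4, size=8)
quality = np.array([0.9, 0.8, 0.2, 0.1, 0.7, 0.3, 0.6, 0.4])
class_weights = rng.normal(size=(4, 16))
queue = l2_normalize(rng.normal(size=(32, 16)))

loss_cls, loss_con = joint_loss(feats, feats_aug, labels, quality,
                                class_weights, queue)
queue = update_queue(queue, feats[quality < 0.5])
print(loss_cls, loss_con, queue.shape)
```

In this sketch both branches share the same features, mirroring the single-encoder design, and the queue is refreshed with features from the current batch rather than from a separate momentum encoder. How the two losses are weighted and how the queue is actually maintained are left open here because the abstract does not state them.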