The need for more transparent face recognition (FR), along with other vision-based decision-making systems, has recently attracted increasing attention in research, society, and industry. Due to the large number of parameters and the complexity of deep learning-based FR models, the reasons why two face images are matched or not matched are not obvious. However, such insight is important for users, operators, and developers to ensure trust and accountability of the system and to analyze drawbacks such as biased behavior. While many previous works use spatial semantic maps to highlight the regions that significantly influence the decision of an FR system, the frequency components, which are also considered by CNNs, are neglected. In this work, we take a step forward and investigate explainable face recognition in the unexplored frequency domain. This makes this work the first to propose explainability of verification-based decisions in the frequency domain, explaining the relative influence of the frequency components of each input on the obtained outcome. To achieve this, we manipulate face images in the spatial frequency domain and investigate the impact on verification outcomes. In extensive quantitative experiments, along with two special scenarios, cross-resolution FR and morphing attacks (the latter in the supplementary material), we demonstrate the applicability of our proposed frequency-based explanations.
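The core manipulation described above, removing selected frequency components of a face image before it is passed to the verification model, can be sketched as follows. This is a minimal illustration using a radial band mask on the 2-D Fourier spectrum; the band limits, image size, and the helper name `mask_frequency_band` are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def mask_frequency_band(image, r_low, r_high):
    """Zero out spectral components whose radial distance from the
    spectrum centre lies in [r_low, r_high), then invert the transform."""
    # 2-D FFT, shifted so the zero frequency sits at the centre
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    band = (radius >= r_low) & (radius < r_high)
    spectrum[band] = 0.0
    # Back to the spatial domain; residual imaginary parts are numerical noise
    return np.fft.ifft2(np.fft.ifftshift(spectrum)).real

# Toy usage on a random grayscale "face": suppress a mid-frequency band.
# In an explanation pipeline, the filtered image would be fed to the FR
# model and the shift in verification similarity attributed to that band.
img = np.random.rand(112, 112)
filtered = mask_frequency_band(img, 10, 30)
```

Repeating this over a sweep of frequency bands and recording the resulting change in the verification score yields a per-band influence profile, which is the kind of frequency-domain explanation the abstract refers to.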