TransFIRA: Transfer Learning for Face Image Recognizability Assessment

Face recognition in unconstrained environments such as surveillance, video, and web imagery must contend with extreme variation in pose, blur, illumination, and occlusion, where conventional visual quality metrics fail to predict whether an input is truly recognizable to the deployed encoder. Existing face image quality assessment (FIQA) methods typically rely on visual heuristics, curated annotations, or computationally intensive generative pipelines, leaving their predictions detached from the encoder's decision geometry. We introduce TransFIRA (Transfer Learning for Face Image Recognizability Assessment), a lightweight, annotation-free framework that grounds recognizability directly in embedding space. TransFIRA delivers three advances: (i) a definition of recognizability via class-center similarity (CCS) and class-center angular separation (CCAS), yielding the first natural, decision-boundary-aligned criterion for filtering and weighting; (ii) a recognizability-informed aggregation strategy that achieves state-of-the-art verification accuracy on BRIAR and IJB-C while nearly doubling correlation with true recognizability, all without external labels, heuristics, or backbone-specific training; and (iii) new extensions beyond faces, including encoder-grounded explainability that reveals how degradations and subject-specific factors affect recognizability, and the first method for body recognizability assessment. Experiments confirm state-of-the-art results on faces, strong performance on body recognition, and robustness under cross-dataset shifts and out-of-distribution evaluation. Together, these contributions establish TransFIRA as a unified, geometry-driven framework for recognizability assessment that is encoder-specific, accurate, interpretable, and extensible across modalities.
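The abstract names CCS and CCAS without giving formulas, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that CCS is the cosine similarity between an embedding and its own class center, that CCAS is CCS minus the highest cosine similarity to any other class center, that CCAS > 0 serves as the filtering criterion, and that CCS serves as the aggregation weight. The function names `ccs_ccas` and `aggregate` are hypothetical. The scores are computed here from labeled class centers; how they are obtained for unlabeled inputs at deployment is not covered by this sketch.

```python
import math

def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def ccs_ccas(embedding, centers, label):
    """Return (CCS, CCAS) for one embedding under the assumed definitions.

    CCS  = cosine similarity to the embedding's own class center.
    CCAS = CCS minus the highest similarity to any other class center.
    """
    ccs = cosine(embedding, centers[label])
    nearest_other = max(
        cosine(embedding, c) for k, c in centers.items() if k != label
    )
    return ccs, ccs - nearest_other

def aggregate(embeddings, scores):
    """Build a template from embeddings and their (CCS, CCAS) scores.

    Assumed rule: keep embeddings with CCAS > 0 and weight them by CCS.
    If nothing passes the filter, fall back to an unweighted mean.
    """
    kept = [(e, ccs) for e, (ccs, ccas) in zip(embeddings, scores) if ccas > 0]
    if not kept:
        kept = [(e, 1.0) for e in embeddings]
    total = [0.0] * len(embeddings[0])
    for e, w in kept:
        for i, x in enumerate(normalize(e)):
            total[i] += w * x
    return normalize(total)

# Toy usage: two class centers in 2-D, two images claimed to belong to class "a".
centers = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
images = [[1.0, 0.1], [0.2, 1.0]]
scores = [ccs_ccas(e, centers, "a") for e in images]
template = aggregate(images, scores)
```

In this toy setup the second image lies closer to the center of class "b" than to its own, so its CCAS is negative and it is excluded from the template. A positive CCAS means the embedding falls on the correct side of the nearest-center boundary, which is one way to read the "decision-boundary-aligned" criterion described above.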
