Modern face recognition models achieve high overall accuracy but continue to exhibit systematic biases that disproportionately affect certain subpopulations. Conventional bias evaluation frameworks rely on labeled attributes to form subpopulations, which are expensive to obtain and limited to predefined categories. We introduce Latent Feature Alignment (LFA), an attribute-label-free algorithm that uses latent directions to identify subpopulations. This yields two main benefits over standard clustering: (i) semantically coherent grouping, where faces sharing common attributes are grouped together more reliably than by proximity-based methods, and (ii) discovery of interpretable directions, which correspond to semantic attributes such as age, ethnicity, or attire. Across four state-of-the-art recognition models (ArcFace, CosFace, ElasticFace, PartialFC) and two benchmarks (RFW, CelebA), LFA consistently outperforms k-means and nearest-neighbor search in intra-group semantic coherence, while uncovering interpretable latent directions aligned with demographic and contextual attributes. These results position LFA as a practical method for representation auditing of face recognition models, enabling practitioners to identify and interpret biased subpopulations without predefined attribute annotations.
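The abstract contrasts grouping along a latent direction with proximity-based grouping but does not spell out LFA's procedure. The sketch below is therefore not LFA itself, only a toy illustration of that contrast under stated assumptions: the function names, the hand-made 2D embeddings, the choice of direction, and the majority-label definition of coherence are all hypothetical and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def group_by_direction(embeddings, direction, k):
    """Indices of the k embeddings most aligned with a given latent direction."""
    scores = l2_normalize(embeddings) @ l2_normalize(direction)
    return np.argsort(-scores)[:k]


def group_by_neighbors(embeddings, seed_index, k):
    """Indices of the k embeddings closest (by cosine) to a seed; the seed is included."""
    unit = l2_normalize(embeddings)
    sims = unit @ unit[seed_index]
    return np.argsort(-sims)[:k]


def coherence(labels, indices):
    """Fraction of a group that shares the group's most common attribute label.

    Labels are used only to score a group after the fact, never to form it.
    """
    _, counts = np.unique(labels[indices], return_counts=True)
    return counts.max() / len(indices)


# Toy data (assumption): the first axis encodes a binary attribute,
# the second axis is unrelated variation.
embeddings = np.array([
    [1.0, 0.9],
    [1.0, -0.9],
    [1.0, 0.0],
    [0.1, 1.0],
    [0.1, -1.0],
    [-1.0, 0.0],
])
labels = np.array([1, 1, 1, 0, 0, 0])
direction = np.array([1.0, 0.0])  # hypothetical attribute direction

direction_group = group_by_direction(embeddings, direction, k=3)
neighbor_group = group_by_neighbors(embeddings, seed_index=0, k=3)

print("direction-based coherence:", coherence(labels, direction_group))
print("neighbor-based coherence:", coherence(labels, neighbor_group))
```

In this toy setup the direction-based group contains only faces with the shared attribute, while the neighbor-based group around the first embedding also pulls in a face that is close in the unrelated axis. A real audit would use embeddings from a recognition model and directions discovered from the data rather than chosen by hand.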