Considerations on the Evaluation of Biometric Quality Assessment Algorithms

Quality assessment algorithms can be used to estimate the utility of a biometric sample for the purpose of biometric recognition. "Error versus Discard Characteristic" (EDC) plots, and "partial Area Under Curve" (pAUC) values of the curves therein, are generally used by researchers to evaluate the predictive performance of such quality assessment algorithms. An EDC curve depends on an error type such as the "False Non-Match Rate" (FNMR), a quality assessment algorithm, a biometric recognition system, a set of comparisons each corresponding to a biometric sample pair, and a comparison score threshold corresponding to a starting error. To compute an EDC curve, comparisons are progressively discarded based on the associated samples' lowest quality scores, and the error is computed for the remaining comparisons. Additionally, a discard fraction limit or range must be selected to compute pAUC values, which can then be used to quantitatively rank quality assessment algorithms. This paper discusses and analyses various details of this kind of quality assessment algorithm evaluation, including general EDC properties, interpretability improvements for pAUC values based on a hard lower error limit and a soft upper error limit, the use of relative instead of discrete rankings, stepwise vs. linear curve interpolation, and normalisation of quality scores to a [0, 100] integer range. We also analyse the stability of quantitative quality assessment algorithm rankings based on pAUC values across varying pAUC discard fraction limits and starting errors, concluding that higher pAUC discard fraction limits should be preferred. The analyses are conducted both with synthetic data and with real data for a face image quality assessment scenario, with a focus on general modality-independent conclusions for EDC evaluations.
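The EDC and pAUC computation summarised above can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: the function names `edc_curve` and `pauc_stepwise` are hypothetical, all comparisons are mated pairs with similarity scores (so a score below the threshold counts as a false non-match), the threshold is passed in directly rather than derived from a chosen starting error, comparisons with tied pairwise quality are discarded together, and the pAUC uses plain stepwise interpolation without the hard lower or soft upper error limits.

```python
def edc_curve(pair_qualities, scores, threshold):
    """Return EDC points [(discard_fraction, fnmr), ...] for mated comparisons.

    Assumption: scores are similarity scores, so score < threshold is an error.
    """
    n = len(scores)
    # Pairwise quality is the lower of the two samples' quality scores.
    items = sorted(
        (min(q1, q2), s < threshold)
        for (q1, q2), s in zip(pair_qualities, scores)
    )
    errors_left = sum(is_error for _, is_error in items)
    points = [(0.0, errors_left / n)]  # starting error, nothing discarded
    i = 0
    while i < n:
        quality = items[i][0]
        # Discard every comparison tied at the current lowest quality.
        while i < n and items[i][0] == quality:
            errors_left -= items[i][1]
            i += 1
        remaining = n - i
        if remaining > 0:
            # Error is computed over the remaining comparisons only.
            points.append((i / n, errors_left / remaining))
    return points


def pauc_stepwise(points, discard_limit):
    """Area under the stepwise EDC curve on [0, discard_limit]."""
    area = 0.0
    # Each error value holds until the next discard fraction is reached.
    successors = points[1:] + [(1.0, 0.0)]
    for (x0, y0), (x1, _) in zip(points, successors):
        if x0 >= discard_limit:
            break
        area += (min(x1, discard_limit) - x0) * y0
    return area


# Toy data (made up for illustration): five mated comparisons.
pair_qualities = [(10, 80), (90, 70), (30, 30), (60, 95), (50, 20)]
scores = [0.2, 0.9, 0.4, 0.8, 0.7]
points = edc_curve(pair_qualities, scores, threshold=0.5)
for fraction, fnmr in points:
    print(f"discard {fraction:.1f} -> FNMR {fnmr:.3f}")
print("pAUC up to 0.5:", pauc_stepwise(points, 0.5))
```

In this toy data the FNMR rises between two discard steps, because a correctly matched comparison is discarded before a remaining false non-match. A lower pAUC for the same threshold, comparison set, and discard fraction limit indicates that the quality scores predict errors better.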
