Matching algorithms are commonly used to predict matches between items in a collection. For example, in 1:1 face verification, a matching algorithm predicts whether two face images depict the same person. Accurately assessing the uncertainty of the error rates of such algorithms can be challenging when data are dependent and error rates are low, two aspects that have often been overlooked in the literature. In this work, we review methods for constructing confidence intervals for error rates in matching tasks such as 1:1 face verification. We derive and examine the statistical properties of these methods and demonstrate how coverage and interval width vary with sample size, error rate, and degree of data dependence, using both synthetic and real-world datasets. Based on our findings, we recommend best practices for constructing confidence intervals for error rates in matching tasks.
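To make the low-error-rate difficulty concrete, the following minimal sketch compares two textbook binomial intervals, the Wald (normal approximation) interval and the Wilson score interval. It is an illustration only, not necessarily one of the methods reviewed in this work: the function names and the example counts are hypothetical, and both intervals assume independent comparisons, which is exactly the assumption that fails when the same faces appear in many pairs.

```python
import math


def wald_interval(errors, n, z=1.96):
    """Normal-approximation interval for an error rate (assumes independent trials)."""
    p = errors / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for an error rate (assumes independent trials)."""
    p = errors / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical example: 0 observed false matches in 1000 comparisons.
wald = wald_interval(0, 1000)
wilson = wilson_interval(0, 1000)
print("Wald:  ", wald)
print("Wilson:", wilson)
```

With zero observed errors, the Wald interval collapses to a single point at zero, while the Wilson interval still reports a positive upper bound. Neither accounts for dependence between comparisons, so with dependent data both can be expected to be too narrow.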