A False Sense of Privacy: Towards a Reliable Evaluation Methodology for the Anonymization of Biometric Data

Biometric data contains distinctive human traits such as facial features or gait patterns. Because it distinguishes individuals so precisely, biometric data is used effectively in identification and authentication systems. For the same reason, privacy protection is indispensable. Anonymization is widely used to provide this protection: it obfuscates or removes sensitive personal data to achieve a high level of anonymity. However, the effectiveness of anonymization depends just as much on the effectiveness of the methods used to evaluate anonymization performance. In this paper, we assess the state-of-the-art methods used to evaluate the performance of anonymization techniques for facial images and gait patterns. We demonstrate that these evaluation methods have serious and frequent shortcomings. In particular, we find that their underlying assumptions are unwarranted: when an evaluation method assumes a weak adversary or a weak recognition scenario, the resulting evaluation will very likely be a gross overestimation of anonymization performance. We therefore propose a stronger adversary model that is aware of both the recognition scenario and the anonymization scenario, and that implements an appropriate measure of anonymization performance. We also improve the selection process for the evaluation dataset: we reduce the number of identities contained in the dataset while ensuring that these identities remain easily distinguishable from one another. By measuring worst-case performance, our evaluation methodology improves on the state of the art and delivers a highly reliable evaluation of biometric anonymization techniques.
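
The overestimation effect described above can be illustrated with a toy sketch. It is not the methodology of this paper: the 2-D synthetic "biometric" features, the mirroring `anonymize` function, the nearest-centroid recognizer, and all function names are hypothetical assumptions chosen only to show how a weak adversary (trained on clear data) and a stronger, anonymization-aware adversary (trained on anonymized data) can disagree, and why the worst case is the one to report.

```python
import random

def make_samples(centers, n_per_id, sigma, rng):
    """Draw noisy 2-D samples per identity (synthetic stand-in for biometric features)."""
    return [(label, (cx + rng.gauss(0, sigma), cy + rng.gauss(0, sigma)))
            for label, (cx, cy) in enumerate(centers)
            for _ in range(n_per_id)]

def anonymize(point, rng, sigma=0.5):
    """Hypothetical anonymizer: mirror the first feature, then add a little noise."""
    x, y = point
    return (40.0 - x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))

def fit_centroids(samples):
    """Toy recognition system: one mean feature vector per identity."""
    sums = {}
    for label, (x, y) in samples:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

def accuracy(centroids, samples):
    """Fraction of samples whose nearest centroid carries the true identity."""
    hits = 0
    for label, (x, y) in samples:
        guess = min(centroids,
                    key=lambda c: (centroids[c][0] - x) ** 2 + (centroids[c][1] - y) ** 2)
        hits += guess == label
    return hits / len(samples)

def evaluate(seed=0):
    rng = random.Random(seed)
    # Five well-separated identities (assumed layout, purely illustrative).
    centers = [(10.0 * i, 0.0) for i in range(5)]
    train_clear = make_samples(centers, 50, 0.5, rng)
    test_clear = make_samples(centers, 50, 0.5, rng)
    train_anon = [(label, anonymize(p, rng)) for label, p in train_clear]
    test_anon = [(label, anonymize(p, rng)) for label, p in test_clear]
    # Weak adversary: recognizer trained on clear data, attacked with anonymized data.
    weak = accuracy(fit_centroids(train_clear), test_anon)
    # Stronger adversary: knows the anonymizer and trains on anonymized data.
    strong = accuracy(fit_centroids(train_anon), test_anon)
    # Worst-case anonymity is set by the most successful adversary.
    return {"weak": weak, "strong": strong,
            "worst_case_anonymity": 1.0 - max(weak, strong)}

if __name__ == "__main__":
    print(evaluate())
```

In this sketch the weak adversary recognizes few identities, which would suggest strong anonymization, while the anonymization-aware adversary recovers most of them because the mirroring is deterministic and therefore learnable. Reporting `1 - max(accuracy)` over the adversaries considered is one simple way to express a worst-case measure; the measure used in the paper itself may differ.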
