Face Recognition (FR) systems are vulnerable to both physical attacks (e.g., printed photos) and digital attacks (e.g., DeepFake). However, previous work rarely considers both at the same time, so defending against them requires deploying multiple models, which increases the computational burden. The lack of an integrated model stems from two factors: (1) no existing dataset includes both physical and digital attacks with ID consistency, meaning that the same ID covers the real face and all attack types; (2) given the large intra-class variance between these two kinds of attack, it is difficult to learn a compact feature space that detects both simultaneously. To address these issues, we collect a Unified physical-digital Attack dataset, called UniAttackData. The dataset covers 1,800 participants, with 2 physical and 12 digital attack types, for a total of 29,706 videos. We then propose a Unified Attack Detection framework based on Vision-Language Models (VLMs), namely UniAttackDetection, which includes three main modules: the Teacher-Student Prompts (TSP) module, focused on acquiring unified and specific knowledge, respectively; the Unified Knowledge Mining (UKM) module, designed to capture a comprehensive feature space; and the Sample-Level Prompt Interaction (SLPI) module, aimed at grasping sample-level semantics. Together, these three modules form a robust unified attack detection framework. Extensive experiments on UniAttackData and three other datasets demonstrate the superiority of our approach for unified face attack detection.