Recent face presentation attack detection (PAD) methods leverage domain adaptation (DA) and domain generalization (DG) techniques to address performance degradation on unknown domains. However, DA-based PAD methods require access to unlabeled target data, while most DG-based PAD solutions rely on a priori (i.e., known) domain labels. Moreover, most DA-/DG-based methods are computationally intensive, demanding complex model architectures and/or multi-stage training processes. This paper proposes to model face PAD as a compound DG task from a causal perspective, linking it to model optimization. We uncover the causal factors hidden in the high-level representation via counterfactual intervention. In addition, we introduce a class-guided MixStyle that enriches the feature-level data distribution within classes instead of focusing on domain information. Both the class-guided MixStyle and the counterfactual intervention components introduce no extra trainable parameters and incur negligible computational overhead. Extensive cross-dataset and analytic experiments demonstrate the effectiveness and efficiency of our method compared to state-of-the-art PAD methods. The implementation and the trained weights are publicly available.
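To make the class-guided MixStyle idea concrete, the following is a minimal NumPy sketch and not the released implementation. It assumes one plausible reading of the abstract: the per-instance feature statistics (channel-wise mean and standard deviation, as in standard MixStyle) are mixed only between samples that share the same class label (bona fide or attack), so no domain labels are needed. The function names `same_class_permutation` and `class_guided_mixstyle`, the Beta parameter `alpha=0.1`, and the `(batch, channels, height, width)` layout are illustrative assumptions.

```python
import numpy as np


def same_class_permutation(labels, rng):
    """Return a batch permutation that only swaps samples of the same class."""
    labels = np.asarray(labels)
    perm = np.arange(labels.shape[0])
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        # Shuffle the positions of this class among themselves.
        perm[idx] = rng.permutation(idx)
    return perm


def class_guided_mixstyle(feats, labels, alpha=0.1, eps=1e-6, rng=None):
    """Mix feature statistics between same-class samples (hypothetical sketch).

    feats:  array of shape (batch, channels, height, width)
    labels: class label per sample, e.g. 0 = bona fide, 1 = attack
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = feats.shape[0]

    # Per-instance, per-channel statistics over the spatial dimensions.
    mu = feats.mean(axis=(2, 3), keepdims=True)
    sigma = np.sqrt(feats.var(axis=(2, 3), keepdims=True) + eps)
    normed = (feats - mu) / sigma

    # Pick a mixing partner from the same class (assumption: this is the
    # "class-guided" part; domain labels are never consulted).
    perm = same_class_permutation(labels, rng)

    # Convex combination of the two styles, one weight per sample.
    lam = rng.beta(alpha, alpha, size=(batch, 1, 1, 1))
    mu_mix = lam * mu + (1.0 - lam) * mu[perm]
    sigma_mix = lam * sigma + (1.0 - lam) * sigma[perm]

    # Re-style the normalized features; no trainable parameters are involved.
    return normed * sigma_mix + mu_mix
```

In a training loop, such a function would typically be applied to intermediate feature maps during training only and skipped at inference; the sketch has no learnable weights, which is consistent with the claim that the component adds no trainable parameters.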