Lack of generalization to unseen domains/attacks is the Achilles heel of most face presentation attack detection (FacePAD) algorithms. Existing attempts to enhance the generalizability of FacePAD solutions assume that data from multiple source domains are available to a single entity to enable centralized training. In practice, data from different source domains may be collected by diverse entities, who are often unable to share their data due to legal and privacy constraints. While collaborative learning paradigms such as federated learning (FL) can overcome this problem, standard FL methods are ill-suited for domain generalization because they struggle to surmount the twin challenges of handling non-iid client data distributions during training and generalizing to unseen domains during inference. In this work, a novel framework called Federated Split learning with Intermediate representation Sampling (FedSIS) is introduced for privacy-preserving domain generalization. In FedSIS, a hybrid Vision Transformer (ViT) architecture is learned using a combination of FL and split learning to achieve robustness against statistical heterogeneity in the client data distributions without any sharing of raw data (thereby preserving privacy). To further improve generalization to unseen domains, a novel feature augmentation strategy called intermediate representation sampling is employed, and discriminative information from intermediate blocks of a ViT is distilled using a shared adapter network. The FedSIS approach has been evaluated on two well-known benchmarks for cross-domain FacePAD to demonstrate that it is possible to achieve state-of-the-art generalization performance without data sharing. Code: https://github.com/Naiftt/FedSIS
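The idea of intermediate representation sampling can be illustrated with a toy sketch. It is not the authors' implementation (see the repository linked above for that). The sketch assumes that one block's output is drawn uniformly at random per step and passed to a single adapter shared across blocks; the abstract does not state the sampling distribution. The names `run_blocks`, `sample_intermediate`, `make_scale_block`, and `shared_adapter` are hypothetical, and the "blocks" are simple scaling functions standing in for ViT blocks.

```python
import random


def run_blocks(x, blocks):
    """Apply the blocks in order and return the output of every block."""
    outputs = []
    for block in blocks:
        x = block(x)
        outputs.append(x)
    return outputs


def sample_intermediate(outputs, rng):
    """Pick one block's output uniformly at random (assumed distribution)."""
    idx = rng.randrange(len(outputs))
    return idx, outputs[idx]


def make_scale_block(factor):
    # Hypothetical stand-in for a transformer block: scales every feature.
    return lambda feats: [factor * v for v in feats]


def shared_adapter(feats):
    # Hypothetical stand-in for the shared adapter: averages the features.
    return sum(feats) / len(feats)


blocks = [make_scale_block(f) for f in (2.0, 3.0, 0.5)]
rng = random.Random(0)  # seeded so the sketch is reproducible

outputs = run_blocks([1.0, 2.0], blocks)
idx, feats = sample_intermediate(outputs, rng)
score = shared_adapter(feats)
```

Because the same adapter receives features from a different depth on each step, it is exposed to a wider variety of representations than the final block alone would provide, which is the feature-augmentation effect the abstract describes.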