A secure fingerprint recognition system must contain both a presentation attack (i.e., spoof) detection module and a recognition module in order to protect users against unwanted access by malicious users. Traditionally, these tasks have been carried out by two independent systems; however, recent studies have demonstrated that a single unified architecture can reduce the computational burden on the system while maintaining high accuracy. In this work, we leverage a vision transformer (ViT) architecture for joint spoof detection and matching, and report results competitive with state-of-the-art (SOTA) models for both a sequential system (two ViT models operating independently) and a unified architecture (a single ViT model for both tasks). ViT models are particularly well suited to this task because the ViT's global embedding encodes features useful for recognition, whereas the individual, local embeddings are useful for spoof detection. We demonstrate that our unified model achieves an average integrated matching (IM) accuracy of 98.87% across the LivDet 2013 and 2015 CrossMatch sensors. This is comparable to the IM accuracy of 98.95% achieved by our sequential dual-ViT system, but with ~50% of the parameters and ~58% of the latency.
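The division of labor between the global and local embeddings can be illustrated with a minimal NumPy sketch. It is not the model described in this work. The token layout (class token first), the mean pooling of patch tokens, the linear spoof head, the thresholds, and all function names are assumptions made for illustration only. The sketch treats an integrated match as accepted only when the probe is judged live and its global embedding matches the enrolled one, which is one plausible reading of the IM decision.

```python
import numpy as np


def split_vit_tokens(tokens):
    """Split ViT output tokens into global and local embeddings.

    Assumes tokens has shape (1 + num_patches, dim) with the class
    token in row 0 (a common ViT convention, assumed here).
    """
    return tokens[0], tokens[1:]


def match_score(global_a, global_b):
    """Cosine similarity between two global embeddings."""
    a = global_a / np.linalg.norm(global_a)
    b = global_b / np.linalg.norm(global_b)
    return float(a @ b)


def spoof_score(local_tokens, weights, bias):
    """Hypothetical spoof head: mean-pool patch tokens, linear layer, sigmoid."""
    pooled = local_tokens.mean(axis=0)
    logit = float(pooled @ weights + bias)
    return 1.0 / (1.0 + np.exp(-logit))


def integrated_match(probe_tokens, enrolled_tokens, weights, bias,
                     spoof_threshold=0.5, match_threshold=0.5):
    """Accept only if the probe looks live AND matches the enrolled print."""
    probe_global, probe_local = split_vit_tokens(probe_tokens)
    enrolled_global, _ = split_vit_tokens(enrolled_tokens)
    if spoof_score(probe_local, weights, bias) >= spoof_threshold:
        return False  # rejected as a presentation attack
    return match_score(probe_global, enrolled_global) >= match_threshold
```

In a unified setup of this kind, both scores come from one forward pass of a single backbone, whereas a sequential system would run two separate models to produce them.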