This study uses the advanced capabilities of the GPT-4 multimodal Large Language Model (LLM) to explore its potential in iris recognition, a field less common and more specialized than face recognition. By focusing on this niche yet crucial area, we investigate how well AI tools like ChatGPT can understand and analyze iris images. Through a series of carefully designed experiments using a zero-shot learning approach, the capabilities of ChatGPT-4 were assessed across various challenging conditions, including diverse datasets, presentation attacks, occlusions such as glasses, and other real-world variations. The findings demonstrate ChatGPT-4's remarkable adaptability and precision, revealing its proficiency in identifying distinctive iris features while also detecting subtle effects, such as makeup, on iris recognition. A comparative analysis with Gemini Advanced, Google's AI model, highlighted ChatGPT-4's better performance and user experience in complex iris analysis tasks. This research not only validates the use of LLMs for specialized biometric applications but also emphasizes the importance of nuanced query framing and interaction design in extracting significant insights from biometric data. Our findings suggest a promising path for future research and for the development of more adaptable, efficient, robust, and interactive biometric security solutions.