The growing demand for diverse, high-quality facial datasets for training and testing biometric systems is challenged by privacy regulations, data scarcity, and ethical concerns. Synthetic facial images offer a potential solution, yet existing generative models often struggle to balance realism, diversity, and identity preservation. This paper presents SCHIGAND, a novel synthetic face generation pipeline integrating StyleCLIP, HyperStyle, InterfaceGAN, and diffusion models to produce highly realistic and controllable facial datasets. SCHIGAND enhances identity preservation while generating realistic intra-class variations and maintaining inter-class distinctiveness, making it suitable for biometric testing. The generated datasets were evaluated with ArcFace, a leading facial verification model, to assess their effectiveness relative to real-world facial datasets. Experimental results demonstrate that SCHIGAND achieves a balance between image quality and diversity, addressing key limitations of prior generative models. This research highlights the potential of SCHIGAND to supplement, and in some cases replace, real data for facial biometric applications, paving the way for privacy-compliant and scalable solutions in synthetic dataset generation.
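The ArcFace-based evaluation mentioned above rests on comparing facial embeddings by cosine similarity: two images are judged to share an identity when their embeddings are sufficiently aligned. A minimal sketch of that verification step is shown below; the toy 4-dimensional vectors and the 0.5 threshold are placeholders (real ArcFace embeddings are 512-dimensional and the operating threshold is tuned per deployment), since the abstract does not specify these details:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(emb_a, emb_b, threshold=0.5):
    """Accept the pair as the same identity when similarity exceeds the threshold.
    The threshold is a hypothetical placeholder, not a value from the paper."""
    return cosine_similarity(emb_a, emb_b) >= threshold

# Toy 4-d "embeddings" standing in for 512-d ArcFace features (illustrative only).
genuine_a = [0.9, 0.1, 0.0, 0.4]
genuine_b = [0.8, 0.2, 0.1, 0.5]
impostor  = [-0.7, 0.6, 0.3, -0.2]

print(verify(genuine_a, genuine_b))  # same-identity pair -> True
print(verify(genuine_a, impostor))   # different-identity pair -> False
```

In practice, sweeping the threshold over genuine and impostor pairs yields the verification curves (e.g. TAR at fixed FAR) used to compare synthetic datasets against real ones.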