Diagnosing facial paralysis remains challenging: it relies heavily on the subjective judgment and experience of clinicians, which introduces variability and uncertainty into the assessment. Automatic estimation of facial paralysis is a promising real-world application, but the scarcity of facial paralysis datasets limits the development of robust machine learning models for automated diagnosis and therapeutic intervention. To address this gap, this study aims to synthesize a high-quality facial paralysis dataset that enables more accurate and efficient algorithm training. Specifically, we propose a novel Cross-Fusion Cycle Palsy Expression Generative Model (CFCPalsy), based on the diffusion model, that combines different features of facial information and enhances the visual detail of facial appearance and texture, producing synthetic facial images that accurately represent various degrees and types of facial paralysis. We evaluate the proposed method qualitatively and quantitatively on commonly used public clinical datasets of facial paralysis. Experimental results indicate that it surpasses state-of-the-art methods, generating more realistic facial images while maintaining identity consistency.