Realistic facial animation is in high demand, but generating it remains challenging. Existing approaches to speech-driven facial animation produce satisfactory mouth movement and lip synchronization, but they struggle with dramatic emotional expressions and offer limited flexibility in emotion control. This paper presents a novel deep learning-based approach to generating expressive facial animation from speech, which can exhibit a wide spectrum of facial expressions with controllable emotion type and intensity. We propose an emotion controller module that learns the relationship between emotion variations (e.g., type and intensity) and the corresponding facial expression parameters. This module enables emotion-controllable facial animation in which the target expression can be continuously adjusted as desired. Qualitative and quantitative evaluations show that the animation generated by our method is rich in emotional facial expressiveness while retaining accurate lip movement, outperforming other state-of-the-art methods.
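To make the idea of continuously adjustable emotion intensity concrete, here is a minimal sketch. It is not the paper's emotion controller, which is a learned module. It assumes instead a fixed, hand-written offset per emotion type that is scaled linearly by an intensity in [0, 1] and added to speech-driven expression parameters. The names `EMOTION_OFFSETS` and `apply_emotion`, the three-element parameter vector, and all numeric values are hypothetical.

```python
from typing import Dict, List

# Hypothetical per-emotion offsets in a toy 3-dimensional expression
# parameter space. In the paper this mapping is learned, not hand-written.
EMOTION_OFFSETS: Dict[str, List[float]] = {
    "happy": [0.8, 0.1, 0.0],
    "angry": [0.0, 0.7, 0.5],
}


def apply_emotion(base: List[float], emotion: str, intensity: float) -> List[float]:
    """Shift speech-driven expression parameters toward an emotion.

    Assumption: a linear blend stands in for the learned controller.
    intensity = 0.0 leaves the parameters unchanged; intensity = 1.0
    applies the full offset for the chosen emotion type.
    """
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    offset = EMOTION_OFFSETS[emotion]
    if len(offset) != len(base):
        raise ValueError("parameter vectors must have the same length")
    return [b + intensity * o for b, o in zip(base, offset)]


if __name__ == "__main__":
    neutral = [0.0, 0.0, 0.0]
    # Sweep the intensity to show the expression changing continuously.
    for level in (0.0, 0.5, 1.0):
        print(level, apply_emotion(neutral, "happy", level))
```

Because the intensity is a continuous scalar, any value between 0 and 1 yields an intermediate expression, which is the property the abstract refers to as continuous adjustment.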