VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer

Current talking face generation methods mainly focus on speech-lip synchronization. However, insufficient attention to facial talking style leads to lifeless, monotonous avatars. Most previous works fail both to imitate expressive styles from arbitrary video prompts and to ensure the authenticity of the generated video. This paper proposes an unsupervised variational style transfer model (VAST) to vivify neutral photo-realistic avatars. The model consists of three key components: a style encoder that extracts facial style representations from the given video prompts; a hybrid facial expression decoder that models accurate speech-related movements; and a variational style enhancer that makes the style space highly expressive and meaningful. With these designs for facial style learning, the model can flexibly capture the expressive facial style from arbitrary video prompts and transfer it onto a personalized image renderer in a zero-shot manner. Experimental results demonstrate that the proposed approach yields a more vivid talking avatar with higher authenticity and richer expressiveness.
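The abstract does not specify how the variational style enhancer is built. As a rough illustration of what "variational" usually means for a latent style space, the sketch below shows the standard Gaussian reparameterization trick and the KL regularizer used in variational autoencoders. This is an assumption about the general technique, not the authors' architecture; the function names, the 4-dimensional style code, and all numeric values are hypothetical.

```python
import math
import random


def sample_style(mean, log_var, rng):
    """Reparameterization: z = mean + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean, log_var):
    """KL(N(mean, diag(exp(log_var))) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mean, log_var))


# Hypothetical 4-dimensional style code, as a style encoder might predict
# from a video prompt (values invented for illustration only).
style_mean = [0.5, -1.0, 0.0, 2.0]
style_log_var = [0.0, -2.0, 0.0, -4.0]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
z = sample_style(style_mean, style_log_var, rng)
kl = kl_to_standard_normal(style_mean, style_log_var)

print("sampled style code:", z)
print("KL regularizer:", kl)
```

In a setup like this, the sampled code `z` would condition a decoder, while the KL term keeps the style space smooth enough that unseen prompts map to usable styles. Whether VAST uses this exact formulation is not stated in the abstract.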

