We introduce 3D Gaussian blendshapes for modeling photorealistic head avatars. Taking a monocular video as input, we learn a base head model of neutral expression along with a group of expression blendshapes, each of which corresponds to a basis expression in classical parametric face models. Both the neutral model and the expression blendshapes are represented as 3D Gaussians, each carrying a small set of properties that describe the avatar's appearance. The avatar model for an arbitrary expression is generated by linearly blending the Gaussians of the neutral model and the expression blendshapes, weighted by the expression coefficients. High-fidelity head avatar animations can then be synthesized in real time using Gaussian splatting. Compared to state-of-the-art methods, our Gaussian blendshape representation better captures the high-frequency details in the input video and achieves superior rendering performance.
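The linear blending described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `blend_gaussians`, the attribute names (`position`, `rotation`), the choice to store each blendshape as a per-Gaussian offset from the neutral model, and the renormalization of rotation quaternions after blending are all assumptions made for illustration.

```python
import numpy as np


def blend_gaussians(neutral, deltas, coeffs):
    """Linearly blend per-Gaussian attributes (illustrative sketch).

    neutral: dict mapping attribute name -> (N, D) array for the neutral model
    deltas:  dict mapping attribute name -> (K, N, D) array; assumed here to hold
             each blendshape as an offset from the neutral model
    coeffs:  (K,) expression coefficients
    """
    coeffs = np.asarray(coeffs, dtype=np.float64)
    blended = {}
    for name, base in neutral.items():
        # Weighted sum over the K blendshapes: sum_k coeffs[k] * deltas[name][k]
        offset = np.tensordot(coeffs, deltas[name], axes=1)
        blended[name] = base + offset

    # Assumption: rotations are quaternions, so re-normalize after blending
    if "rotation" in blended:
        q = blended["rotation"]
        blended["rotation"] = q / np.linalg.norm(q, axis=-1, keepdims=True)
    return blended
```

With all coefficients set to zero, this sketch returns the neutral model unchanged; the blended attributes would then be passed to a Gaussian splatting renderer, which is outside the scope of the example.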