This work presents Interactive Conversational 3D Virtual Human (ICo3D), a method for generating an interactive, conversational, and photorealistic 3D human avatar. From multi-view captures of a subject, we create an animatable 3D face model and a dynamic 3D body model, both rendered by splatting Gaussian primitives. Once merged, they form a lifelike virtual human avatar suitable for real-time user interaction. We equip the avatar with an LLM for conversational ability. During conversation, the avatar's audio speech is used as a driving signal to animate the face model, enabling precise synchronization. We describe improvements to our dynamic Gaussian models that enhance photorealism (SWinGS++ for body reconstruction and HeadGaS++ for face reconstruction) and provide a solution for merging the separate face and body models without artifacts. We also present a demo of the complete system, showcasing several use cases of real-time conversation with the 3D avatar. Our approach offers a fully integrated virtual avatar experience, supporting both spoken and written interaction in immersive environments. ICo3D is applicable to a wide range of fields, including gaming, virtual assistance, and personalized education, among others. Project page: https://ico3d.github.io/