We present Drivable 3D Gaussian Avatars (D3GA), the first 3D controllable model for human bodies rendered with Gaussian splats. Current photorealistic drivable avatars require accurate 3D registrations during training, dense input images during testing, or both. Those based on neural radiance fields also tend to be prohibitively slow for telepresence applications. This work uses the recently presented 3D Gaussian Splatting (3DGS) technique to render realistic humans at real-time framerates, using dense calibrated multi-view videos as input. To deform those primitives, we depart from the commonly used point deformation method of linear blend skinning (LBS) and use a classic volumetric deformation method: cage deformations. We drive these deformations with joint angles and keypoints, which are compact signals and therefore more suitable for communication applications. In experiments on nine subjects with varied body shapes, clothes, and motions, our method obtains higher-quality results than state-of-the-art methods when using the same training and test data.
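To make the contrast with per-point skinning concrete, the following is a minimal sketch of one common form of cage deformation: a point embedded in a tetrahedral cage cell is expressed in barycentric coordinates of the rest-pose cell and re-evaluated on the deformed cell. The abstract does not specify the cage type or how the Gaussian parameters are updated, so the tetrahedral cell, the function names (`barycentric`, `deform_point`), and the example vertices are assumptions made for illustration only, not the authors' implementation.

```python
# Illustrative sketch only: moves a single 3D point (e.g. a Gaussian center)
# with a tetrahedral cage cell. Names and data here are hypothetical.

def sub(a, b):
    """Component-wise difference of two 3D vectors."""
    return [a[i] - b[i] for i in range(3)]

def det3(c0, c1, c2):
    """Determinant of the 3x3 matrix whose columns are c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c1[2] * c2[1])
            - c1[0] * (c0[1] * c2[2] - c0[2] * c2[1])
            + c2[0] * (c0[1] * c1[2] - c0[2] * c1[1]))

def barycentric(p, tet):
    """Barycentric coordinates of point p with respect to tetrahedron tet."""
    v0, v1, v2, v3 = tet
    e1, e2, e3 = sub(v1, v0), sub(v2, v0), sub(v3, v0)
    r = sub(p, v0)
    d = det3(e1, e2, e3)
    if abs(d) < 1e-12:
        raise ValueError("degenerate tetrahedron")
    # Cramer's rule for [e1 e2 e3] @ [b1, b2, b3] = r
    b1 = det3(r, e2, e3) / d
    b2 = det3(e1, r, e3) / d
    b3 = det3(e1, e2, r) / d
    return [1.0 - b1 - b2 - b3, b1, b2, b3]

def deform_point(p, rest_tet, posed_tet):
    """Carry p from the rest-pose cell to the posed cell."""
    w = barycentric(p, rest_tet)
    return [sum(w[k] * posed_tet[k][i] for k in range(4)) for i in range(3)]

# Rest-pose cell: a unit tetrahedron.
rest_tet = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Posed cell: stretched by 2 along x and translated by (1, 2, 3).
posed_tet = [[1, 2, 3], [3, 2, 3], [1, 3, 3], [1, 2, 4]]

center = [0.25, 0.25, 0.25]
moved = deform_point(center, rest_tet, posed_tet)
```

Because the mapping inside each cell is affine, every point in the cell follows the cell's vertices consistently, which is what distinguishes a volumetric deformation from blending bone transforms per point as in LBS. A full avatar model would also need to update each Gaussian's rotation and scale, which this sketch leaves out.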