Gaussian Splatting has enabled real-time 3D human avatars with unprecedented levels of visual quality. While previous methods require a desktop GPU for real-time inference of a single avatar, we aim to squeeze multiple Gaussian avatars onto a portable virtual reality headset with real-time drivable inference. We begin by training a previous work, Animatable Gaussians, on a high-quality dataset captured with 512 cameras. The Gaussians are animated by controlling a base set of Gaussians with linear blend skinning (LBS) motion and then further adjusting the Gaussians with a neural network decoder to correct their appearance. When deploying the model on a Meta Quest 3 VR headset, we find two major computational bottlenecks: the decoder and the rendering. To accelerate the decoder, we train the Gaussians in UV-space instead of pixel-space, and we distill the decoder to a single neural network layer. Further, we discover that neighborhoods of Gaussians can share a single corrective from the decoder, which provides an additional speedup. To accelerate the rendering, we develop a custom pipeline in Vulkan that runs on the mobile GPU. Putting it all together, we run 3 Gaussian avatars concurrently at 72 FPS on a VR headset. Demo videos are at https://forresti.github.io/squeezeme.
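The animation pipeline summarized above, LBS motion applied to a base set of Gaussians followed by a single-layer corrective decoder whose output is shared across neighborhoods of Gaussians, can be sketched as below. This is a minimal illustrative sketch, not the paper's implementation: all function names, tensor shapes, and the fixed-size grouping scheme are assumptions for illustration.

```python
import numpy as np

def lbs_transform(base_means, skin_weights, joint_transforms):
    """Linear blend skinning: move each Gaussian mean by a weighted blend
    of per-joint rigid transforms. Shapes are illustrative assumptions:
    base_means (N, 3), skin_weights (N, J), joint_transforms (J, 4, 4)."""
    homog = np.concatenate([base_means, np.ones((len(base_means), 1))], axis=1)  # (N, 4)
    # Blend the per-joint 4x4 transforms into one transform per Gaussian.
    blended = np.einsum("nj,jab->nab", skin_weights, joint_transforms)  # (N, 4, 4)
    out = np.einsum("nab,nb->na", blended, homog)
    return out[:, :3]

def shared_corrective(features, W, b, group_size):
    """Single-layer decoder emitting one corrective per neighborhood of
    Gaussians, broadcast to every Gaussian in that neighborhood.
    features: (N // group_size, F), W: (F, 3), b: (3,)."""
    corr = features @ W + b                     # one corrective per group
    return np.repeat(corr, group_size, axis=0)  # share it within the group
```

With identity joint transforms and skinning weights that sum to one per Gaussian, `lbs_transform` leaves the base means unchanged; the shared corrective then supplies a per-neighborhood appearance adjustment at a fraction of a per-Gaussian decoder's cost.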