We consider the problem of novel view synthesis (NVS) for dynamic scenes. Recent neural approaches have achieved exceptional NVS results for static 3D scenes, but extending them to 4D time-varying scenes remains non-trivial. Prior efforts often encode dynamics by learning a canonical space paired with implicit or explicit deformation fields, an approach that struggles in challenging scenarios such as sudden motions or high-fidelity rendering. In this paper, we introduce 4D Gaussian Splatting (4DGS), a novel method that represents dynamic scenes with anisotropic 4D XYZT Gaussians, inspired by the success of 3D Gaussian Splatting in static scenes. We model the dynamics at each timestamp by temporally slicing the 4D Gaussians, which naturally yields dynamic 3D Gaussians that can be seamlessly projected into images. As an explicit spatio-temporal representation, 4DGS demonstrates powerful capabilities for modeling complicated dynamics and fine details, especially in scenes with abrupt motions. We further implement our temporal slicing and splatting techniques in a highly optimized CUDA acceleration framework, achieving real-time rendering speeds of up to 277 FPS on an RTX 3090 GPU and 583 FPS on an RTX 4090 GPU. Rigorous evaluations on scenes with diverse motions showcase the superior efficiency and effectiveness of 4DGS, which consistently outperforms existing methods both quantitatively and qualitatively.
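The temporal slicing described above can be illustrated with standard multivariate-Gaussian conditioning: fixing the time coordinate of a 4D Gaussian produces a 3D spatial Gaussian whose mean moves with time and whose temporal marginal modulates its contribution. The sketch below is a hypothetical NumPy illustration of this principle, not the authors' CUDA implementation; the function name and the scalar `weight` interpretation are assumptions for exposition.

```python
import numpy as np

def slice_4d_gaussian(mu, cov, t):
    """Condition a 4D XYZT Gaussian N(mu, cov) on time = t.

    Hypothetical sketch of the temporal-slicing idea: returns the mean and
    covariance of the resulting 3D spatial Gaussian, plus the marginal
    temporal density that scales its contribution at time t.
    """
    mu_x, mu_t = mu[:3], mu[3]
    cov_xx = cov[:3, :3]   # spatial block
    cov_xt = cov[:3, 3]    # space-time coupling (drives apparent motion)
    cov_tt = cov[3, 3]     # temporal variance (controls the Gaussian's lifespan)

    # Conditional mean shifts linearly in t, so the sliced Gaussian "travels".
    mean_3d = mu_x + cov_xt * (t - mu_t) / cov_tt
    # Conditional covariance is the Schur complement of the time block.
    cov_3d = cov_xx - np.outer(cov_xt, cov_xt) / cov_tt
    # Marginal temporal density: how strongly this Gaussian contributes at t.
    weight = np.exp(-0.5 * (t - mu_t) ** 2 / cov_tt) / np.sqrt(2 * np.pi * cov_tt)
    return mean_3d, cov_3d, weight
```

Because the conditional mean is linear in `t`, even a single anisotropic 4D Gaussian encodes straight-line motion for free; abrupt motions are handled by the collection of many short-lived Gaussians (small `cov_tt`) rather than by a learned deformation field.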