We consider the problem of novel view synthesis (NVS) for dynamic scenes. Recent neural approaches have achieved exceptional NVS results for static 3D scenes, but extending them to 4D time-varying scenes remains non-trivial. Prior efforts often encode dynamics by learning a canonical space plus implicit or explicit deformation fields, and these struggle with sudden movements and with high-fidelity rendering. In this paper, we introduce 4D Gaussian Splatting (4DGS), a novel method that represents dynamic scenes with anisotropic 4D XYZT Gaussians, inspired by the success of 3D Gaussian Splatting in static scenes. We model the dynamics at each timestamp by temporally slicing the 4D Gaussians, which naturally yields dynamic 3D Gaussians that can be seamlessly projected onto images. As an explicit spatial-temporal representation, 4DGS demonstrates powerful capabilities for modeling complicated dynamics and fine details, especially in scenes with abrupt motions. We further implement our temporal slicing and splatting techniques in a highly optimized CUDA acceleration framework, achieving real-time rendering speeds at inference of up to 277 FPS on an RTX 3090 GPU and 583 FPS on an RTX 4090 GPU. Rigorous evaluations on scenes with diverse motions show that 4DGS is both efficient and effective, consistently outperforming existing methods quantitatively and qualitatively.
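The abstract does not state the slicing formulas, so the following is only a minimal sketch of one natural reading: temporal slicing treated as conditioning a 4D Gaussian on a fixed time t, using the standard conditional-Gaussian identities. The function name `slice_4d_gaussian`, the (x, y, z, t) ordering, and the unnormalized temporal weight are illustrative assumptions and are not taken from the 4DGS CUDA implementation.

```python
import math


def slice_4d_gaussian(mean, cov, t):
    """Condition a 4D (x, y, z, t) Gaussian on a fixed time t.

    Hypothetical helper for illustration only.
    mean: sequence of 4 floats ordered (x, y, z, t).
    cov:  4x4 symmetric positive-definite covariance as nested lists.
    Returns (mean_3d, cov_3d, weight), where weight is the unnormalized
    temporal marginal exp(-0.5 * (t - mu_t)^2 / var_t).
    """
    mu_t = mean[3]
    var_t = cov[3][3]
    if var_t <= 0.0:
        raise ValueError("temporal variance must be positive")

    dt = t - mu_t
    # Space-time cross-covariance (the last column of the upper 3 rows).
    cross = [cov[i][3] for i in range(3)]

    # Conditional mean: the spatial center drifts linearly with dt.
    mean_3d = [mean[i] + cross[i] * dt / var_t for i in range(3)]

    # Conditional covariance: Schur complement of the temporal variance.
    cov_3d = [
        [cov[i][j] - cross[i] * cross[j] / var_t for j in range(3)]
        for i in range(3)
    ]

    # The Gaussian fades out as t moves away from its temporal center.
    weight = math.exp(-0.5 * dt * dt / var_t)
    return mean_3d, cov_3d, weight


# Example: a Gaussian centered at t = 0.5 whose x position correlates with time.
example_mean = [0.0, 0.0, 0.0, 0.5]
example_cov = [
    [0.04, 0.0, 0.0, 0.01],
    [0.0, 0.04, 0.0, 0.0],
    [0.0, 0.0, 0.04, 0.0],
    [0.01, 0.0, 0.0, 0.01],
]
sliced_mean, sliced_cov, sliced_weight = slice_4d_gaussian(
    example_mean, example_cov, 0.6
)
print(sliced_mean, sliced_cov[0][0], sliced_weight)
```

In this example the x-t covariance equals the temporal variance, so the sliced center moves along x at unit rate: at t = 0.6 it sits about 0.1 away from the center at t = 0.5, and the Gaussian's contribution is attenuated by the temporal weight. Under this reading, the resulting 3D mean and covariance could then be passed to an ordinary 3D Gaussian splatting rasterizer.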