This paper presents DENSER, an efficient and effective approach leveraging 3D Gaussian splatting (3DGS) for the reconstruction of dynamic urban environments. While several methods for photorealistic scene representation, both implicit using neural radiance fields (NeRF) and explicit using 3DGS, have shown promising results on relatively complex dynamic scenes, modeling the dynamic appearance of foreground objects tends to be challenging, limiting the ability of these methods to capture the subtleties and details of a scene, especially for distant dynamic objects. To this end, we propose DENSER, a framework that significantly enhances the representation of dynamic objects and accurately models their appearance in driving scenes. Instead of directly using spherical harmonics (SH) to model the appearance of dynamic objects, we introduce and integrate a new method that dynamically estimates SH bases using wavelets, yielding a better representation of dynamic object appearance in both space and time. Beyond appearance, DENSER enhances object shape representation by densifying the object point cloud across multiple scene frames, resulting in faster convergence of model training. Extensive evaluations on the KITTI dataset show that the proposed approach outperforms state-of-the-art methods by a wide margin. Source code and models will be made available at https://github.com/sntubix/denser
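To make the core idea concrete, the wavelet-based modulation of SH coefficients described above can be sketched as follows. This is an illustrative toy implementation, not the paper's actual method: the Ricker wavelet family, the additive modulation of static SH coefficients, and all function and parameter names (`ricker`, `time_varying_sh`, `wavelet_weights`, `centers`, `scales`) are assumptions made for exposition.

```python
import numpy as np

def ricker(t, center, scale):
    # Ricker ("Mexican hat") wavelet evaluated at normalized time t.
    x = (t - center) / scale
    return (1.0 - 2.0 * np.pi**2 * x**2) * np.exp(-np.pi**2 * x**2)

def time_varying_sh(base_sh, wavelet_weights, centers, scales, t):
    """Modulate SH coefficients over time with a small wavelet expansion.

    base_sh:         (num_coeffs, 3) static per-Gaussian SH coefficients (RGB)
    wavelet_weights: (num_wavelets, num_coeffs, 3) learned modulation weights
    centers, scales: per-wavelet location and width in normalized time
    t:               scalar timestamp in [0, 1]
    Returns time-dependent SH coefficients of shape (num_coeffs, 3).
    """
    # Evaluate each wavelet at time t, then take a weighted sum of the
    # learned modulations and add it to the static coefficients.
    basis = np.array([ricker(t, c, s) for c, s in zip(centers, scales)])
    return base_sh + np.einsum('w,wcd->cd', basis, wavelet_weights)
```

With zero wavelet weights this reduces to the ordinary static SH appearance model; nonzero weights let each Gaussian's view-dependent color vary smoothly across frames.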