Neural Radiance Fields (NeRFs) have demonstrated remarkable proficiency in synthesizing photorealistic images of large-scale scenes. However, they often lose fine details and render slowly. 3D Gaussian Splatting has recently been introduced as a potent alternative, achieving both high-fidelity visual results and accelerated rendering performance. Nonetheless, scaling 3D Gaussian Splatting is fraught with challenges. Specifically, large-scale scenes contain objects at multiple scales observed from disparate viewpoints, which often compromises quality because the same Gaussians must balance between levels of detail. Furthermore, generating initialization points via COLMAP from large-scale datasets is both computationally demanding and prone to incomplete reconstructions. To address these challenges, we present Pyramidal 3D Gaussian Splatting (PyGS) with NeRF Initialization. Our approach represents the scene with a hierarchical assembly of Gaussians arranged in a pyramidal fashion. The top level of the pyramid is composed of a few large Gaussians, while each subsequent level accommodates a denser collection of smaller Gaussians. We effectively initialize these pyramidal Gaussians by sampling a rapidly trained grid-based NeRF at various frequencies. We group these pyramidal Gaussians into clusters and use a compact weighting network to dynamically determine the influence of each pyramid level of each cluster based on the camera viewpoint during rendering. Our method achieves a significant performance leap across multiple large-scale datasets and attains a rendering time that is over 400 times faster than current state-of-the-art approaches.
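The pyramidal initialization and per-level weighting described in the abstract can be illustrated with a minimal sketch. This is not the PyGS implementation: `toy_density` is a hypothetical stand-in for querying a trained grid-based NeRF, the grid resolutions and density threshold are assumed values, and `level_weights` replaces the learned weighting network with a hand-written distance heuristic purely to show what a per-level weight vector looks like.

```python
import math


def toy_density(x, y, z):
    """Hypothetical stand-in for a density query on a trained grid-based NeRF."""
    # A solid ball of radius 0.45 centred in the unit cube.
    dist_sq = (x - 0.5) ** 2 + (y - 0.5) ** 2 + (z - 0.5) ** 2
    return 1.0 if dist_sq < 0.45 ** 2 else 0.0


def init_pyramid(density_fn, resolutions=(2, 4, 8), threshold=0.5):
    """Sample the density field on progressively finer grids.

    Returns one list of (center, scale) pairs per pyramid level, coarsest
    first. The scale is simply the grid cell size (an assumption).
    """
    levels = []
    for res in resolutions:
        cell = 1.0 / res
        gaussians = []
        for i in range(res):
            for j in range(res):
                for k in range(res):
                    center = ((i + 0.5) * cell, (j + 0.5) * cell, (k + 0.5) * cell)
                    # Keep only cells that the field reports as occupied.
                    if density_fn(*center) > threshold:
                        gaussians.append((center, cell))
        levels.append(gaussians)
    return levels


def level_weights(camera_distance, num_levels, sharpness=2.0):
    """Softmax weights over pyramid levels for one cluster.

    Assumed heuristic, not a learned network: nearby cameras favour fine
    levels, distant cameras favour coarse ones.
    """
    preferred = (num_levels - 1) / (1.0 + camera_distance)
    scores = [-sharpness * (lvl - preferred) ** 2 for lvl in range(num_levels)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


pyramid = init_pyramid(toy_density)
near = level_weights(camera_distance=0.0, num_levels=len(pyramid))
far = level_weights(camera_distance=100.0, num_levels=len(pyramid))
```

In this sketch each finer level holds more Gaussians with smaller scales, mirroring the coarse-to-fine structure of the pyramid, and the weight vector shifts from the finest level toward the coarsest as the camera moves away.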