3D Gaussian splatting (3DGS) has recently emerged as an alternative scene representation that models a scene with 3D Gaussians and renders it with an approximated volumetric rendering, achieving very fast rendering speed and promising image quality. Subsequent studies have successfully extended 3DGS to dynamic 3D scenes, demonstrating its wide range of applications. However, 3DGS and its follow-up methods have a significant drawback: they require a substantial number of Gaussians to maintain the high fidelity of the rendered images, which demands a large amount of memory and storage. To address this critical issue, we focus on two key objectives: reducing the number of Gaussian points without sacrificing performance, and compressing the Gaussian attributes, such as view-dependent color and covariance. To this end, we propose a learnable mask strategy that significantly reduces the number of Gaussians while preserving high performance. In addition, we propose a compact but effective representation of view-dependent color that employs a grid-based neural field rather than spherical harmonics. Finally, we learn codebooks that compactly represent the geometric and temporal attributes through residual vector quantization. Combined with model compression techniques such as quantization and entropy coding, our approach consistently achieves over 25x storage reduction and faster rendering than 3DGS for static scenes, while maintaining the quality of the scene representation. For dynamic scenes, our approach achieves more than 12x higher storage efficiency than existing state-of-the-art methods while retaining high-quality reconstruction. Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering. Our project page is available at https://maincold2.github.io/c3dgs/.
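To make the residual vector quantization step more concrete, the following is a minimal sketch of the generic technique in NumPy. It is not the authors' implementation: the function names `rvq_encode` and `rvq_decode` are hypothetical, and the codebooks are assumed to be given as fixed arrays, whereas the abstract describes them as learned. Each stage quantizes whatever residual the previous stages left behind, so an attribute vector is stored as a few small integer indices instead of full floating-point values.

```python
import numpy as np


def rvq_encode(x, codebooks):
    """Encode each row of x as one code index per stage.

    x         : array of shape (N, D), the attribute vectors to compress.
    codebooks : list of arrays, each of shape (K, D), one per stage.
                (Assumed fixed here; in practice they would be learned.)
    Returns an integer array of shape (N, number_of_stages).
    """
    residual = np.asarray(x, dtype=np.float64).copy()
    indices = []
    for codebook in codebooks:
        # Squared distance from every residual to every code: shape (N, K).
        diffs = residual[:, None, :] - codebook[None, :, :]
        dists = (diffs ** 2).sum(axis=-1)
        idx = dists.argmin(axis=1)
        indices.append(idx)
        # The next stage only has to represent what this stage missed.
        residual = residual - codebook[idx]
    return np.stack(indices, axis=1)


def rvq_decode(indices, codebooks):
    """Reconstruct vectors by summing the selected code from every stage."""
    dim = codebooks[0].shape[1]
    out = np.zeros((indices.shape[0], dim))
    for stage, codebook in enumerate(codebooks):
        out += codebook[indices[:, stage]]
    return out
```

With S stages of K codes each, a vector costs S·log2(K) bits for its indices plus the shared codebooks, which is where the storage saving comes from when many Gaussians share the same codebooks.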