3D Gaussian Splatting (3DGS) has demonstrated superior quality in modeling 3D objects and scenes. However, generating 3DGS remains challenging due to its discrete, unstructured, and permutation-invariant nature. In this work, we present a simple yet effective method to overcome these challenges. We use spherical mapping to transform 3DGS into a structured 2D representation, termed UVGS. UVGS can be viewed as a multi-channel image whose feature dimension is a concatenation of Gaussian attributes such as position, scale, color, opacity, and rotation. We further find that these heterogeneous features can be compressed into a lower-dimensional (e.g., 3-channel) shared feature space using a carefully designed multi-branch network. The compressed UVGS can then be treated as a typical RGB image. Remarkably, we discover that typical VAEs trained with latent diffusion models generalize directly to this new representation without additional training. Our representation makes it straightforward to leverage foundational 2D models, such as diffusion models, to model 3DGS directly. Additionally, one can simply increase the 2D UV resolution to accommodate more Gaussians, which makes UVGS more scalable than typical 3D backbones. By building on mature 2D generation capabilities, this approach immediately unlocks a range of novel generation applications for 3DGS. In our experiments, we demonstrate unconditional generation, conditional generation, and inpainting of 3DGS based on diffusion models, all of which were previously non-trivial.
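To make the spherical-mapping idea concrete, the following is a minimal sketch of how a set of Gaussians might be scattered into a multi-channel UV image. The abstract does not specify the mapping details, so everything here is an assumption for illustration: the function name `gaussians_to_uv_map` is hypothetical, the sphere is centered at the centroid of the Gaussian positions, angles are binned on an equirectangular grid, the 14 channels are laid out as position (3), scale (3), color (3), opacity (1), and rotation quaternion (4), and when several Gaussians fall into the same pixel the most opaque one is kept.

```python
import numpy as np

# Assumed channel layout: position(3) + scale(3) + color(3) + opacity(1) + rotation(4).
NUM_CHANNELS = 14


def gaussians_to_uv_map(positions, scales, colors, opacities, rotations,
                        height=64, width=128):
    """Scatter per-Gaussian attributes into a (height, width, 14) UV map.

    Returns the UV map and a boolean mask of the pixels that hold a Gaussian.
    """
    positions = np.asarray(positions, dtype=np.float64)
    opacities = np.asarray(opacities, dtype=np.float64).reshape(-1)

    # Assumption: measure angles from the centroid of all Gaussian centers.
    centered = positions - positions.mean(axis=0)
    x, y, z = centered[:, 0], centered[:, 1], centered[:, 2]
    radius = np.maximum(np.linalg.norm(centered, axis=1), 1e-12)

    theta = np.arctan2(y, x)                           # azimuth in [-pi, pi]
    phi = np.arccos(np.clip(z / radius, -1.0, 1.0))    # polar angle in [0, pi]

    # Equirectangular binning of the two angles into pixel coordinates.
    u = np.minimum(((theta + np.pi) / (2 * np.pi) * width).astype(int), width - 1)
    v = np.minimum((phi / np.pi * height).astype(int), height - 1)

    features = np.concatenate(
        [positions,
         np.asarray(scales, dtype=np.float64),
         np.asarray(colors, dtype=np.float64),
         opacities[:, None],
         np.asarray(rotations, dtype=np.float64)],
        axis=1,
    )

    uv_map = np.zeros((height, width, NUM_CHANNELS))
    mask = np.zeros((height, width), dtype=bool)

    # Write in order of increasing opacity so that, on a collision,
    # the most opaque Gaussian is the one left in the pixel.
    for i in np.argsort(opacities, kind="stable"):
        uv_map[v[i], u[i]] = features[i]
        mask[v[i], u[i]] = True
    return uv_map, mask


def uv_map_to_gaussians(uv_map, mask):
    """Recover the (num_kept, 14) attribute table from the occupied pixels."""
    return uv_map[mask]
```

Under these assumptions, raising `height` and `width` reduces collisions and lets the map hold more Gaussians, which mirrors the scalability argument above. The multi-branch compression to 3 channels and the pretrained VAE are separate learned components and are not sketched here.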