Neural Radiance Fields (NeRFs) have demonstrated the remarkable potential of neural networks to capture the intricacies of 3D objects. By encoding shape and color information in neural network weights, NeRFs excel at producing strikingly sharp novel views of 3D objects. Recently, numerous generalizations of NeRFs utilizing generative models have emerged, expanding their versatility. In contrast, Gaussian Splatting (GS) offers similar rendering quality with faster training and inference, as it does not need neural networks to work. It encodes information about 3D objects in a set of Gaussian distributions that can be rendered in 3D similarly to classical meshes. Unfortunately, GS is difficult to condition, since it usually requires around a hundred thousand Gaussian components. To mitigate the caveats of both models, we propose a hybrid model, Viewing Direction Gaussian Splatting (VDGS), that uses a GS representation of the 3D object's shape and a NeRF-based encoding of color and opacity. Our model uses Gaussian distributions with trainable positions (i.e., the means of the Gaussians), shapes (i.e., the covariances of the Gaussians), colors, and opacities, together with a neural network that takes the Gaussian parameters and the viewing direction and produces changes in that color and opacity. As a result, our model better describes shadows, light reflections, and the transparency of 3D objects without additional texture and light components.
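To make the hybrid idea concrete, the following is a minimal NumPy sketch of a view-dependent appearance update, not the authors' implementation. The abstract only states that a network maps Gaussian parameters and viewing direction to changes in color and opacity. Everything else here is an assumption for illustration: the function names `init_mlp`, `mlp`, and `view_dependent_appearance`, the 19-dimensional input layout, the small ReLU MLP, the additive form of the update, and the clipping to [0, 1].

```python
import numpy as np

def init_mlp(rng, sizes):
    """Randomly initialise a small MLP (hypothetical architecture)."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """ReLU hidden layers followed by a linear output layer."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

def view_dependent_appearance(params, means, covs, colors, opacities, camera_pos):
    """Return per-Gaussian color and opacity adjusted for one camera position.

    means: (N, 3), covs: (N, 3, 3), colors: (N, 3), opacities: (N,)
    Assumption: the network output is added to the base color and opacity.
    """
    n = means.shape[0]
    # Unit viewing direction from the camera to each Gaussian centre.
    dirs = means - camera_pos
    dirs = dirs / (np.linalg.norm(dirs, axis=1, keepdims=True) + 1e-8)
    # Input features: 3 (mean) + 9 (covariance) + 3 (color) + 1 (opacity) + 3 (direction) = 19.
    feats = np.concatenate(
        [means, covs.reshape(n, 9), colors, opacities[:, None], dirs], axis=1)
    delta = mlp(params, feats)  # (N, 4): color change and opacity change
    new_colors = np.clip(colors + delta[:, :3], 0.0, 1.0)
    new_opacities = np.clip(opacities + delta[:, 3], 0.0, 1.0)
    return new_colors, new_opacities
```

In this sketch the Gaussian positions and covariances are left untouched, so the geometry stays a plain GS representation, and only the appearance passed to the rasterizer depends on the camera. A network built with `init_mlp(rng, [19, 32, 4])` matches the assumed input and output sizes.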