We present a differentiable framework that automatically learns view-dependent 2D kernels in a splatting-based pipeline, improving reconstruction quality and representation efficiency for novel 3D view synthesis. Our volumetric primitive is defined by a bounding ellipsoid and a 3D-kernel latent vector. We first learn a projection network that takes the attributes of the ellipsoid and the 3D-kernel latent as input and outputs a 2D-kernel latent. This 2D-kernel latent is then passed to a decoder, which produces a radially symmetric 2D kernel expressed as a function of Mahalanobis distance and bounded by the projected ellipsoid. The neural networks and the per-primitive attributes are jointly optimized. We demonstrate the effectiveness of our approach on standard benchmarks, where it compares favorably against state-of-the-art techniques based on both analytical and learned kernels. Finally, we extend the idea to learn general 2D kernels for 2D splatting as well as image representation.
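To make the phrase "radially symmetric 2D kernel in terms of Mahalanobis distance, bounded by the projected ellipsoid" concrete, the following is a minimal sketch and not the authors' implementation. It assumes the projected ellipsoid is described by a 2D mean and a 2x2 covariance, and it replaces the learned decoder with a hypothetical piecewise-linear lookup table (`tabulated_profile`). The function names and the cutoff `r_max` are illustrative assumptions that do not appear in the abstract.

```python
import math


def mahalanobis_2d(p, mu, cov):
    """Mahalanobis distance of 2D point p from mu under a 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must have a positive determinant")
    dx, dy = p[0] - mu[0], p[1] - mu[1]
    # Quadratic form [dx dy] cov^{-1} [dx dy]^T, using the closed-form 2x2 inverse.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(max(q, 0.0))


def gaussian_profile(r):
    """Analytic baseline: the usual Gaussian falloff as a function of distance r."""
    return math.exp(-0.5 * r * r)


def tabulated_profile(samples, r_max):
    """Hypothetical stand-in for a learned decoder: a 1D radial profile given by
    linear interpolation of `samples` placed uniformly on [0, r_max]."""
    n = len(samples) - 1

    def profile(r):
        if r >= r_max:
            return 0.0
        t = r / r_max * n
        i = int(t)
        w = t - i
        return (1.0 - w) * samples[i] + w * samples[i + 1]

    return profile


def kernel_value(p, mu, cov, profile, r_max=3.0):
    """Evaluate a radially symmetric kernel at p; zero outside the bounding ellipse."""
    r = mahalanobis_2d(p, mu, cov)
    return profile(r) if r <= r_max else 0.0


if __name__ == "__main__":
    mu = (0.0, 0.0)
    cov = ((4.0, 0.0), (0.0, 1.0))  # ellipse elongated along x
    learned_like = tabulated_profile([1.0, 0.8, 0.3, 0.0], r_max=3.0)
    for p in [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (10.0, 0.0)]:
        print(p, kernel_value(p, mu, cov, gaussian_profile),
              kernel_value(p, mu, cov, learned_like))
```

Because the kernel depends on the pixel only through its Mahalanobis distance, any two pixels on the same ellipse contour (here `(2, 0)` and `(0, 1)`) receive the same value, and swapping `gaussian_profile` for another 1D profile changes the falloff without changing the footprint. In the paper the profile comes from a decoder conditioned on a view-dependent latent. The fixed table here only illustrates the interface.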