NeRF is a popular model that efficiently represents 3D objects from 2D images. However, vanilla NeRF has some important limitations: it must be trained on each object separately, training takes a long time since the object's shape and color are encoded in the neural network weights, and it does not generalize well to unseen data. In this paper, we present MultiPlaneNeRF, a model that addresses all of these problems simultaneously. Our model works directly on 2D images: we project 3D points onto the 2D images to produce a non-trainable representation. The projection step is not parametrized, and a very shallow decoder can efficiently process the representation. Furthermore, we can train MultiPlaneNeRF on a large data set and force the implicit decoder to generalize across many objects. Consequently, we can simply replace the 2D images (without additional training) to produce a NeRF representation of a new object. In the experimental section, we demonstrate that MultiPlaneNeRF achieves results comparable to state-of-the-art models for novel-view synthesis and generalizes to unseen objects. Additionally, the MultiPlane decoder can be used as a component in large generative models such as GANs.
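To make the projection step concrete, below is a minimal NumPy sketch of the idea, under standard pinhole-camera assumptions: a 3D point is projected onto each reference image, and the pixel colors sampled at its projections are concatenated into a fixed, non-trainable feature vector. The function names (`project_point`, `multiplane_features`), the floor sampling, and the camera conventions (world-to-camera extrinsics, positive depth in front of the camera) are illustrative simplifications, not the exact implementation.

```python
import numpy as np

def project_point(x_world, K, R, t):
    """Project a 3D world point onto one image plane (pinhole model).

    K: 3x3 intrinsics; R, t: world-to-camera rotation and translation.
    Returns pixel coordinates (u, v) and the depth in the camera frame.
    """
    x_cam = R @ x_world + t          # world frame -> camera frame
    u, v, w = K @ x_cam              # homogeneous image coordinates
    return u / w, v / w, x_cam[2]

def multiplane_features(x_world, images, cameras):
    """Non-trainable representation of a 3D point: the concatenation of
    the RGB values sampled at its projection in each reference image."""
    feats = []
    for img, (K, R, t) in zip(images, cameras):
        u, v, depth = project_point(x_world, K, R, t)
        h, w, _ = img.shape
        if depth > 0 and 0 <= u < w and 0 <= v < h:
            feats.append(img[int(v), int(u)])  # floor sampling, for brevity
        else:
            feats.append(np.zeros(3))          # point not visible in this view
    return np.concatenate(feats)               # shape: (3 * num_images,)
```

A toy usage example with synthetic reference views:

```python
# Three random 64x64 reference views sharing one toy camera setup.
rng = np.random.default_rng(0)
images = [rng.random((64, 64, 3)) for _ in range(3)]
K = np.array([[50.0, 0.0, 32.0], [0.0, 50.0, 32.0], [0.0, 0.0, 1.0]])
cameras = [(K, np.eye(3), np.array([0.0, 0.0, 2.0]))] * 3
feat = multiplane_features(np.array([0.1, -0.2, 1.0]), images, cameras)
print(feat.shape)  # (9,)
```

Per the abstract, such a feature vector is then processed by a very shallow decoder, which is the only trainable part of the model; this is what allows swapping in the 2D images of a new object without any additional training.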