Neural Radiance Fields (NeRF) have demonstrated strong performance in novel view synthesis by implicitly modelling 3D representations from multi-view 2D images. However, most existing studies train NeRF models with either a reasonable camera pose initialization or manually crafted camera pose distributions, both of which are often unavailable or hard to acquire for real-world data. We design VMRF, a view matching NeRF that enables effective NeRF training without requiring prior knowledge of camera poses or camera pose distributions. VMRF introduces a view matching scheme that exploits unbalanced optimal transport to produce a feature transport plan for mapping an image rendered from a randomly initialized camera pose to the corresponding real image. Guided by the feature transport plan, a novel pose calibration technique rectifies the initially randomized camera poses by predicting the relative pose transformation between each pair of rendered and real images. Extensive experiments over a number of synthetic and real datasets show that the proposed VMRF outperforms the state-of-the-art qualitatively and quantitatively by large margins.
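The abstract does not spell out the transport formulation, so the following is only a minimal sketch of how a feature transport plan between a rendered and a real image could be computed with a generic entropic unbalanced optimal transport solver (Sinkhorn-style scaling with KL-relaxed marginals). It is not the authors' implementation. The function names `unbalanced_sinkhorn` and `cosine_cost`, the cosine cost, the uniform feature masses, the random stand-in features, and the values of `eps` and `rho` are all assumptions made for illustration.

```python
import numpy as np


def cosine_cost(x, y):
    """Pairwise cosine distance between two feature sets (assumed cost)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T


def unbalanced_sinkhorn(cost, a, b, eps=0.1, rho=1.0, n_iters=200):
    """Entropic unbalanced OT with KL-relaxed marginals.

    cost: (n, m) cost matrix; a: (n,) and b: (m,) non-negative masses.
    eps is the entropic regularization strength and rho the marginal
    relaxation strength (both hypothetical values here).
    Returns the (n, m) transport plan.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    power = rho / (rho + eps)        # damping caused by the KL relaxation
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = (a / (K @ v)) ** power
        v = (b / (K.T @ u)) ** power
    return u[:, None] * K * v[None, :]


# Stand-in features: 6 rendered-image features and 8 real-image features,
# each 16-dimensional. A real pipeline would take these from a feature
# extractor; random values are used only so the sketch runs.
rng = np.random.default_rng(0)
rendered_feats = rng.normal(size=(6, 16))
real_feats = rng.normal(size=(8, 16))

cost = cosine_cost(rendered_feats, real_feats)
a = np.full(6, 1.0 / 6)              # uniform mass on rendered features
b = np.full(8, 1.0 / 8)              # uniform mass on real features

# Relaxed marginals: mass that has no good match can be left untransported.
plan = unbalanced_sinkhorn(cost, a, b)

# With a very large rho the relaxation vanishes and the solver behaves
# like balanced Sinkhorn, so the column marginal is (almost) exactly b.
plan_balanced = unbalanced_sinkhorn(cost, a, b, rho=1e6)
```

The relaxed marginals are the reason an unbalanced formulation is a natural fit here: an image rendered from a random pose may only partially overlap the real view, and a balanced plan would force every feature to be matched. How such a plan is turned into a relative pose prediction is not described in the abstract and is not sketched here.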