Novel view synthesis (NVS) is a challenging computer vision task: synthesizing new views of a scene from a limited set of input images. Neural Radiance Fields (NeRF) have emerged as a powerful approach to this problem, but they require accurate camera \textit{intrinsic} and \textit{extrinsic} parameters. These parameters are traditionally estimated with structure-from-motion (SfM) and multi-view stereo (MVS) pipelines, which can be unreliable and may fail in certain cases. In this paper, we propose a technique that learns camera parameters directly from unposed images in dynamic datasets, such as the NVIDIA dynamic scenes dataset. Our approach is extensible and can be integrated into existing NeRF architectures with minimal modifications. We demonstrate its effectiveness on a variety of static and dynamic scenes and show that it outperforms traditional SfM and MVS approaches. The code for our method is publicly available at \href{https://github.com/redacted/refinerf}{https://github.com/redacted/refinerf}. Our approach offers a promising direction for improving the accuracy and robustness of NeRF-based NVS, and we anticipate that it will be useful for a wide range of applications in computer vision and graphics.