Neural Radiance Fields (NeRF) use multi-view images for 3D scene representation and have shown remarkable performance. Multi-camera systems, one of the primary sources of multi-view images, face challenges such as varying intrinsic parameters and frequent pose changes. Most previous NeRF-based methods assume a single, globally shared camera and seldom consider multi-camera scenarios. Moreover, some pose-robust methods remain susceptible to suboptimal solutions when poses are poorly initialized. In this paper, we propose MC-NeRF, a method that jointly optimizes both intrinsic and extrinsic parameters for bundle-adjusting Neural Radiance Fields. First, we conduct a theoretical analysis to tackle the degenerate case and the coupling issue that arise from jointly optimizing intrinsic and extrinsic parameters. Second, based on the proposed solutions, we introduce an efficient calibration image acquisition scheme for multi-camera systems, including the design of a calibration object. Finally, we present a global end-to-end network, together with a training sequence, that regresses the intrinsic and extrinsic parameters alongside the rendering network. In addition, because most existing datasets are designed for a single camera, we create a new dataset that includes four different styles of multi-camera acquisition systems, allowing readers to generate custom datasets. Experiments confirm the effectiveness of our method when each image corresponds to different camera parameters. Specifically, we use up to 110 images, each with its own intrinsic and extrinsic parameters, to achieve 3D scene representation without providing initial poses. The code and supplementary materials are available at https://in2-viaun.github.io/MC-NeRF.
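To make the multi-camera setting concrete, the following is a minimal sketch of how rays might be generated when every image carries its own intrinsic matrix and pose. It is not the MC-NeRF implementation: it assumes a plain pinhole model without distortion, an OpenCV-style camera frame looking down +z, and camera-to-world poses given as 4x4 matrices. The names `make_intrinsics`, `pixel_rays`, and `cameras` are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_intrinsics(fx, fy, cx, cy):
    """Build a 3x3 pinhole intrinsic matrix (assumed model: no skew, no distortion)."""
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])


def pixel_rays(K, c2w, pixels):
    """Return world-space ray origins and unit directions for pixel coordinates.

    K      : (3, 3) intrinsic matrix of this particular image
    c2w    : (4, 4) camera-to-world pose of this particular image
    pixels : (N, 2) array-like of (u, v) pixel coordinates
    """
    pixels = np.asarray(pixels, dtype=float)
    homog = np.hstack([pixels, np.ones((pixels.shape[0], 1))])  # (N, 3)
    dirs_cam = homog @ np.linalg.inv(K).T        # back-project: K^-1 [u, v, 1]^T
    dirs_world = dirs_cam @ c2w[:3, :3].T        # rotate into the world frame
    dirs_world /= np.linalg.norm(dirs_world, axis=1, keepdims=True)
    origins = np.broadcast_to(c2w[:3, 3], dirs_world.shape).copy()
    return origins, dirs_world


# Hypothetical two-camera rig: each image has its own K and its own pose.
pose_b = np.eye(4)
pose_b[:3, 3] = [1.0, 0.0, 0.0]  # second camera shifted one unit along x
cameras = [
    {"K": make_intrinsics(500.0, 500.0, 320.0, 240.0), "c2w": np.eye(4)},
    {"K": make_intrinsics(800.0, 780.0, 400.0, 300.0), "c2w": pose_b},
]

# The same pixel maps to different rays under different camera parameters.
for cam in cameras:
    origin, direction = pixel_rays(cam["K"], cam["c2w"], [[320.0, 240.0]])
    print(origin[0], direction[0])
```

In a joint-optimization setting, the entries of each `K` and each pose would be trainable quantities rather than the fixed constants used here; the sketch only shows why a single shared camera model cannot describe such a rig.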