Neural Radiance Fields (NeRF) represent 3D scenes from multi-view images and have demonstrated remarkable performance. Multi-camera systems, one of the primary sources of multi-view images, pose challenges such as varying intrinsic parameters and frequent pose changes. Most previous NeRF-based methods assume a single camera and rarely consider multi-camera scenarios. Moreover, some NeRF methods that can optimize intrinsic and extrinsic parameters remain susceptible to suboptimal solutions when these parameters are poorly initialized. In this paper, we propose MC-NeRF, a method that jointly optimizes intrinsic and extrinsic parameters alongside NeRF and allows each image to have its own independent camera parameters. First, we tackle the coupling issue and the degenerate case that arise from jointly optimizing intrinsic and extrinsic parameters. Second, based on the proposed solutions, we introduce an efficient calibration image acquisition scheme for multi-camera systems, including the design of the calibration object. Finally, we present an end-to-end network with a training sequence that estimates the intrinsic and extrinsic parameters together with the rendering network. Furthermore, since most existing datasets are designed for a single camera, we construct a real multi-camera image acquisition system and create a corresponding new dataset that includes both simulated data and real-world captured images. Experiments confirm the effectiveness of our method when each image corresponds to different camera parameters. Specifically, we use multiple cameras, each with different intrinsic and extrinsic parameters, in a real-world system to achieve 3D scene representation without providing initial poses.
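To make the coupling between intrinsic and extrinsic parameters concrete, the following minimal sketch shows one well-known instance of it under a simple pinhole model: for a planar target parallel to the image plane, doubling the focal length and doubling the camera distance yields identical projections, whereas a target with depth variation breaks the tie. This is an illustrative assumption, not necessarily the exact degenerate case analyzed in the paper; the function `project`, the focal lengths, and the point coordinates are hypothetical.

```python
import math

def project(point, focal, tz):
    """Pinhole projection with identity rotation and translation (0, 0, tz).

    Hypothetical helper for illustration; the principal point is at the origin.
    """
    x, y, z = point
    depth = z + tz  # depth of the point in the camera frame
    return (focal * x / depth, focal * y / depth)

def same_image(points, cam_a, cam_b):
    """True if two (focal, tz) cameras project every point to the same pixel."""
    for p in points:
        ua, va = project(p, *cam_a)
        ub, vb = project(p, *cam_b)
        if not (math.isclose(ua, ub) and math.isclose(va, vb)):
            return False
    return True

# Two candidate cameras: (focal length in pixels, distance along z).
cam_a = (500.0, 2.0)
cam_b = (1000.0, 4.0)

# Planar target parallel to the image plane: all points share z = 0.
plane = [(-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0), (-1.0, 1.0, 0.0)]

# Target with depth variation: the same corners plus a copy pushed back by 1.
volume = plane + [(x, y, z + 1.0) for (x, y, z) in plane]

planar_ambiguous = same_image(plane, cam_a, cam_b)    # focal/depth trade-off is invisible
volume_ambiguous = same_image(volume, cam_a, cam_b)   # depth variation exposes the difference

print(planar_ambiguous, volume_ambiguous)
```

In this toy setting, a reprojection loss alone cannot distinguish the two cameras on the planar target, which is one reason the geometry of a calibration object matters when intrinsics and extrinsics are optimized together.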