Neural Radiance Fields (NeRF) have given rise to learning-based 3D reconstruction methods that are widely used in industrial applications. Although prevalent methods achieve considerable improvements on small-scale scenes, reconstructing complex, large-scale scenes remains challenging. First, the background in complex scenes varies widely across views. Second, the current inference pattern, i.e., each pixel relies on a single camera ray, fails to capture contextual information. To address these problems, we propose to enlarge the perception field of each ray and to model interactions among sample points. Specifically, we design a novel inference pattern that lets a single camera ray carry more contextual information and models the relationships among the sample points on each ray. To retain contextual information, a camera ray in our method renders a patch of pixels simultaneously. Moreover, we replace the MLP in neural radiance field models with distance-aware convolutions to enhance feature propagation among sample points on the same camera ray. In summary, like a torchlight, a ray in our method renders a patch of the image; we therefore call the method Torch-NeRF. Extensive experiments on KITTI-360 and LLFF show that Torch-NeRF exhibits excellent performance.
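The two ideas in the abstract, feature mixing among sample points on one ray and a single ray rendering a patch of pixels, can be illustrated with a minimal sketch. The abstract does not define the distance-aware convolution or the patch output head, so everything below is an assumption made for illustration: the function names `distance_aware_conv` and `render_patch`, the exponential weighting `exp(-|t_i - t_j| / tau)`, the `tau` and `kernel_size` parameters, and the use of one grayscale value per patch pixel are all hypothetical. Only the compositing step follows the standard NeRF volume rendering quadrature.

```python
import math


def distance_aware_conv(features, depths, kernel_size=3, tau=1.0):
    """Mix each sample's feature vector with its neighbors on the same ray.

    Assumed weighting (not taken from the paper): neighbor j contributes to
    sample i with weight exp(-|t_i - t_j| / tau), normalized over the window.
    """
    half = kernel_size // 2
    n = len(features)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        window = range(lo, hi)
        weights = [math.exp(-abs(depths[i] - depths[j]) / tau) for j in window]
        total = sum(weights)
        dim = len(features[i])
        out.append([
            sum(w * features[j][d] for w, j in zip(weights, window)) / total
            for d in range(dim)
        ])
    return out


def render_patch(sigmas, patch_colors, depths):
    """Composite per-sample patch colors along one ray.

    sigmas:       density of each sample on the ray
    patch_colors: for each sample, one value per pixel of the patch
                  (grayscale for brevity; a hypothetical output head)
    depths:       sorted sample depths along the ray
    Uses the standard NeRF quadrature: w_i = T_i * (1 - exp(-sigma_i * delta_i)).
    """
    n = len(sigmas)
    num_pixels = len(patch_colors[0])
    patch = [0.0] * num_pixels
    transmittance = 1.0
    for i in range(n):
        # The last interval is treated as effectively infinite.
        delta = depths[i + 1] - depths[i] if i + 1 < n else 1e10
        alpha = 1.0 - math.exp(-sigmas[i] * delta)
        weight = transmittance * alpha
        for p in range(num_pixels):
            patch[p] += weight * patch_colors[i][p]
        transmittance *= 1.0 - alpha
    return patch


if __name__ == "__main__":
    depths = [0.5, 1.0, 2.0, 4.0]
    feats = [[0.1, 0.9], [0.4, 0.6], [0.8, 0.2], [0.3, 0.3]]
    mixed = distance_aware_conv(feats, depths, kernel_size=3, tau=1.0)
    # Toy "head": reuse the two mixed channels as a 2x2 patch (4 pixels).
    colors = [[f[0], f[1], f[0], f[1]] for f in mixed]
    print(render_patch([0.2, 1.0, 3.0, 0.5], colors, depths))
```

In this sketch, nearby samples influence each other more strongly than distant ones, and one ray yields several pixel values instead of one. A real implementation would presumably use learned convolution kernels and RGB outputs in a tensor framework; the plain-Python form is chosen only to keep the example self-contained.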