In NeRF, a critical problem is estimating occupancy effectively to guide empty-space skipping and point sampling. Grid-based methods work well for small-scale scenes. On large-scale scenes, however, they are limited by predefined bounding boxes, grid resolutions, and the high memory cost of grid updates, and thus struggle to speed up training on large-scale, irregularly bounded, complex urban scenes without sacrificing accuracy. In this paper, we propose to learn a continuous and compact large-scale occupancy network that classifies 3D points as occupied or unoccupied. We train this occupancy network end-to-end with the radiance field in a self-supervised manner through three designs. First, we propose a novel imbalanced occupancy loss to regularize the occupancy network. Motivated by the prior that most 3D scene points are unoccupied, this loss lets the occupancy network effectively control the ratio of unoccupied to occupied points. Second, we design an imbalanced architecture containing a large scene network and a small empty-space network, which separately encode the occupied and unoccupied points classified by the occupancy network. This imbalanced structure effectively models the imbalanced nature of occupied and unoccupied regions. Third, we design an explicit density loss to guide the occupancy network by making the density of unoccupied points smaller. To the best of our knowledge, we are the first to learn a continuous and compact occupancy for large-scale NeRF with a network. In our experiments, our occupancy network quickly learns more compact, accurate, and smooth occupancy than the occupancy grid. With our learned occupancy guiding empty-space skipping on challenging large-scale benchmarks, our method consistently obtains higher accuracy than the occupancy grid and speeds up state-of-the-art NeRF methods without sacrificing accuracy.
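The abstract does not give the loss formulas or network definitions, so the following is only an illustrative sketch of how the three designs could fit together, not the authors' implementation. Every name here (`occupancy_ratio_penalty`, `route_density`, `empty_density_penalty`, `target_ratio`, `threshold`) and the specific hinge and mean forms of the penalties are assumptions made for illustration; the networks are passed in as plain callables.

```python
import numpy as np

def occupancy_ratio_penalty(occ_prob, target_ratio=0.1):
    """Assumed form of an imbalanced occupancy regularizer: penalize the mean
    predicted occupancy only when it exceeds a small target ratio."""
    mean_occ = float(np.mean(occ_prob))
    return max(0.0, mean_occ - target_ratio)

def route_density(points, occ_prob, scene_net, empty_net, threshold=0.5):
    """Send points classified as occupied to a (large) scene network and the
    rest to a (small) empty-space network; both are hypothetical callables
    mapping an (N, 3) array to N densities."""
    occupied = occ_prob >= threshold
    density = np.empty(points.shape[0], dtype=np.float64)
    if occupied.any():
        density[occupied] = scene_net(points[occupied])
    if (~occupied).any():
        density[~occupied] = empty_net(points[~occupied])
    return density, occupied

def empty_density_penalty(density, occupied):
    """Assumed form of an explicit density loss: the mean density of points
    classified as unoccupied, which training would push toward zero."""
    empty = ~occupied
    if not empty.any():
        return 0.0
    return float(np.mean(density[empty]))

# Toy usage with stand-in networks that return constant densities.
points = np.zeros((4, 3))
occ_prob = np.array([0.9, 0.1, 0.2, 0.05])
scene_net = lambda p: np.full(len(p), 5.0)
empty_net = lambda p: np.full(len(p), 0.01)

density, occupied = route_density(points, occ_prob, scene_net, empty_net)
ratio_loss = occupancy_ratio_penalty(occ_prob)
density_loss = empty_density_penalty(density, occupied)
```

In a real training loop these terms would be differentiable and added to the rendering loss; the hard threshold used for routing here is a simplification.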