In this work, we introduce SCALAR-NeRF, a novel framework for scalable large-scale neural scene reconstruction. We structure the neural representation as an encoder-decoder architecture: the encoder maps 3D point coordinates to encoded features, and the decoder maps those features to geometry values (volume densities or signed distances) and colors. Our approach first trains a coarse global model on the entire image dataset. We then partition the images into smaller blocks using KMeans, with each block modeled by a dedicated local model. We enlarge the overlapping regions between blocks by scaling up the bounding box of each local block. Notably, the decoder from the global model is shared across blocks, thereby promoting alignment in the feature space of the local encoders. We propose an effective and efficient method to fuse the outputs of these local models into the final reconstruction. With this refined coarse-to-fine strategy, our method outperforms state-of-the-art NeRF methods and demonstrates scalability for large-scale scene reconstruction. The code will be available on our project page at https://aibluefisher.github.io/SCALAR-NeRF/
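The partitioning step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that KMeans is run on per-image camera positions (the abstract does not say which features are clustered), that blocks are axis-aligned boxes, and that overlap comes from scaling each box about its center by a hypothetical `factor`. All function names here are illustrative, and a plain Lloyd's iteration stands in for a library KMeans.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Box = Tuple[Point, Point]  # (min corner, max corner)


def kmeans(points: Sequence[Point], k: int, iters: int = 50) -> List[int]:
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic init (an assumption): k points spread through the input order.
    step = len(points) / k
    centers = [tuple(points[int(i * step)]) for i in range(k)]
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def bounding_box(points: Sequence[Point]) -> Box:
    """Axis-aligned bounding box of a non-empty point set."""
    return (tuple(min(x) for x in zip(*points)),
            tuple(max(x) for x in zip(*points)))


def scale_box(box: Box, factor: float) -> Box:
    """Scale a box about its center; factor > 1 enlarges it."""
    lo, hi = box
    center = [(l + h) / 2 for l, h in zip(lo, hi)]
    half = [(h - l) / 2 * factor for l, h in zip(lo, hi)]
    return (tuple(c - s for c, s in zip(center, half)),
            tuple(c + s for c, s in zip(center, half)))


def contains(box: Box, p: Point) -> bool:
    lo, hi = box
    return all(l <= x <= h for l, x, h in zip(lo, p, hi))


def partition_with_overlap(points: Sequence[Point], k: int,
                           factor: float = 1.5) -> List[List[int]]:
    """Return, per block, the indices of all points inside its scaled box."""
    labels = kmeans(points, k)
    blocks: List[List[int]] = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            blocks.append([])
            continue
        box = scale_box(bounding_box(members), factor)
        blocks.append([i for i, p in enumerate(points) if contains(box, p)])
    return blocks
```

With `factor=1.0` each image belongs to exactly one block. A larger factor lets images near a block boundary fall into several blocks, which is the overlap the local models rely on.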