We address efficient and structure-aware 3D scene representation from images. Our key contribution is nerflets -- a set of local neural radiance fields that together represent a scene. Each nerflet maintains its own spatial position, orientation, and extent, within which it contributes to panoptic, density, and radiance reconstructions. Using only photometric and inferred panoptic image supervision, we directly and jointly optimize the parameters of a set of nerflets to form a decomposed representation of the scene, in which each object instance is represented by a group of nerflets. In experiments on indoor and outdoor environments, we find that nerflets: (1) fit and approximate the scene more efficiently than traditional global NeRFs, (2) allow the extraction of panoptic and photometric renderings from arbitrary views, and (3) enable tasks that are rare for NeRFs, such as 3D panoptic segmentation and interactive editing.
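To make the idea of local fields with their own position, orientation, and extent concrete, here is a minimal sketch of how per-field predictions could be blended at query points. The abstract does not specify the blending rule, so the anisotropic Gaussian falloff, the weight normalization, and the names `influence` and `blend` are all assumptions made for illustration. The per-field values are passed in as plain arrays that stand in for the outputs of the small local networks.

```python
import numpy as np

def influence(points, center, rotation, radii):
    """Assumed Gaussian falloff of one local field at each query point.

    points:   (N, 3) world-space query positions
    center:   (3,)   position of the local field
    rotation: (3, 3) columns are the field's local axes in world coordinates
    radii:    (3,)   extent along each local axis
    """
    # Row-vector form of R^T (x - c): express world offsets in the local frame.
    local = (points - center) @ rotation
    scaled = local / radii
    return np.exp(-0.5 * np.sum(scaled ** 2, axis=-1))

def blend(points, centers, rotations, radii, local_values, eps=1e-8):
    """Blend per-field values with normalized influence weights (an assumption).

    local_values: (K, N) value predicted by each of the K local fields at each
                  of the N points, e.g. density. Here it is a placeholder for
                  the outputs of the per-field networks.
    """
    weights = np.stack([
        influence(points, c, R, r)
        for c, R, r in zip(centers, rotations, radii)
    ])  # shape (K, N)
    weights = weights / (weights.sum(axis=0, keepdims=True) + eps)
    return (weights * local_values).sum(axis=0)

# Two hypothetical fields placed 2 units apart on the x axis.
centers = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
rotations = np.stack([np.eye(3), np.eye(3)])
radii = np.ones((2, 3))
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
# Placeholder predictions: field 0 always outputs 1.0, field 1 always outputs 3.0.
local_values = np.array([[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]])
blended = blend(points, centers, rotations, radii, local_values)
```

Under these assumptions, a point close to one field's center is dominated by that field's prediction, while a point equidistant from both receives an even mix. Because each field only matters within its extent, moving or deleting a group of fields changes the scene locally, which is the kind of property that makes the editing and instance-level decomposition described above plausible.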