Neural Radiance Field (NeRF)-based 3D reconstruction is highly desirable for immersive Augmented and Virtual Reality (AR/VR) applications, but achieving instant (i.e., < 5 seconds) on-device NeRF training remains a challenge. In this work, we first identify the efficiency bottleneck: the need to interpolate NeRF embeddings up to 200,000 times from a 3D embedding grid during each training iteration. To alleviate this, we propose Instant-3D, an algorithm-hardware co-design acceleration framework that achieves instant on-device NeRF training. Our algorithm decomposes the embedding grid representation into color and density branches, so that computational redundancy can be removed by adopting different (1) grid sizes and (2) update frequencies for the two branches. Our hardware accelerator further reduces the dominant memory accesses for embedding grid interpolation by (1) mapping multiple nearby points' memory read requests into one during the feed-forward process, (2) merging embedding grid updates from the same sliding time window during back-propagation, and (3) fusing different computation cores to support the different grid sizes needed by the color and density branches of the Instant-3D algorithm. Extensive experiments validate the effectiveness of Instant-3D, which reduces training time by 41x to 248x while maintaining the same reconstruction quality. Notably, Instant-3D enables instant 3D reconstruction for AR/VR, requiring a reconstruction time of only 1.6 seconds per scene and meeting the AR/VR power consumption constraint of 1.9 W.
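To make the algorithmic idea concrete, the following is a minimal sketch of a decomposed embedding grid, not the Instant-3D implementation. It assumes dense scalar grids, trilinear interpolation, and a toy squared-error objective. All names (`make_grid`, `corner_weights`, `interpolate`, `train`) and the choice to update the color grid less often than the density grid are illustrative assumptions; the abstract states only that the two branches use different grid sizes and update frequencies.

```python
import math
import random


def make_grid(res, seed):
    """Dense res^3 grid of scalar embeddings stored flat (hypothetical layout)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(res ** 3)]


def corner_weights(point, res):
    """Return the 8 (flat_index, weight) pairs used to trilinearly
    interpolate `point`, given in [0, 1]^3, on a grid with res >= 2."""
    scaled = [min(max(c, 0.0), 1.0) * (res - 1) for c in point]
    base = [min(int(math.floor(s)), res - 2) for s in scaled]
    frac = [s - b for s, b in zip(scaled, base)]
    pairs = []
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((frac[0] if dx else 1.0 - frac[0])
                     * (frac[1] if dy else 1.0 - frac[1])
                     * (frac[2] if dz else 1.0 - frac[2]))
                idx = ((base[0] + dx) * res + (base[1] + dy)) * res + (base[2] + dz)
                pairs.append((idx, w))
    return pairs


def interpolate(grid, res, point):
    """Trilinear interpolation: weighted sum of the 8 surrounding corners."""
    return sum(grid[i] * w for i, w in corner_weights(point, res))


def sgd_update(grid, res, points, lr, target):
    """One gradient step on 0.5 * (value - target)^2 for each point."""
    for p in points:
        pairs = corner_weights(p, res)
        err = sum(grid[i] * w for i, w in pairs) - target
        for i, w in pairs:
            grid[i] -= lr * err * w  # d(loss)/d(grid[i]) = err * w


def train(density, density_res, color, color_res, points,
          num_iters, color_every=2, lr=0.5, target=2.0):
    """Toy loop: the density grid is updated every iteration, while the
    color grid is updated only every `color_every` iterations (assumed)."""
    counts = {"density": 0, "color": 0}
    for it in range(num_iters):
        sgd_update(density, density_res, points, lr, target)
        counts["density"] += 1
        if it % color_every == 0:
            sgd_update(color, color_res, points, lr, target)
            counts["color"] += 1
    return counts
```

In this sketch every sampled point touches 8 grid entries per branch on each read and each update, which illustrates why grid interpolation dominates memory traffic. Shrinking one branch's grid or skipping some of its updates reduces that traffic directly.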