Instant-3D: Instant Neural Radiance Field Training Towards On-Device AR/VR 3D Reconstruction

Neural Radiance Field (NeRF) based 3D reconstruction is highly desirable for immersive Augmented and Virtual Reality (AR/VR) applications, but achieving instant (i.e., < 5 seconds) on-device NeRF training remains a challenge. In this work, we first identify the inefficiency bottleneck: the need to interpolate NeRF embeddings up to 200,000 times from a 3D embedding grid during each training iteration. To alleviate this, we propose Instant-3D, an algorithm-hardware co-design acceleration framework that achieves instant on-device NeRF training. Our algorithm decomposes the embedding grid representation into color and density branches, squeezing out computational redundancy by adopting different (1) grid sizes and (2) update frequencies for the two branches. Our hardware accelerator further reduces the dominant memory accesses for embedding grid interpolation by (1) mapping multiple nearby points' memory read requests into one during the feed-forward process, (2) merging embedding grid updates from the same sliding time window during back-propagation, and (3) fusing different computation cores to support the different grid sizes needed by the color and density branches of the Instant-3D algorithm. Extensive experiments validate the effectiveness of Instant-3D, which reduces training time by 41x to 248x while maintaining the same reconstruction quality. Excitingly, Instant-3D has enabled instant 3D reconstruction for AR/VR, requiring a reconstruction time of only 1.6 seconds per scene and meeting the AR/VR power consumption constraint of 1.9 W.
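To make the algorithmic idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows trilinear interpolation from a dense 3D embedding grid, with separate grids for density and color that differ in resolution and update frequency. All names (`make_grid`, `interpolate`, `branches_to_update`), the dense (non-hashed) grid layout, the specific resolutions, and the choice to give the color branch the smaller grid and the lower update frequency are assumptions made for illustration only; the abstract states only that the two branches use different grid sizes and update frequencies.

```python
def make_grid(res, dim, fill=0.0):
    """Dense grid with (res + 1)^3 vertices, each holding a dim-long embedding."""
    n = res + 1
    return {"res": res, "dim": dim, "data": [[fill] * dim for _ in range(n ** 3)]}


def vertex_index(grid, i, j, k):
    """Flatten integer vertex coordinates into an index into grid['data']."""
    n = grid["res"] + 1
    return (i * n + j) * n + k


def fill_from_function(grid, fn):
    """Set every vertex embedding to fn(x, y, z), with x, y, z in [0, 1]."""
    res = grid["res"]
    for i in range(res + 1):
        for j in range(res + 1):
            for k in range(res + 1):
                value = fn(i / res, j / res, k / res)
                grid["data"][vertex_index(grid, i, j, k)] = list(value)


def interpolate(grid, point):
    """Trilinearly interpolate the embedding at a point in [0, 1]^3.

    Each call reads the 8 vertices of the enclosing cell; this per-point
    read pattern is the kind of access the abstract calls the bottleneck.
    """
    res = grid["res"]
    # Clamp to the unit cube, then scale to grid coordinates.
    scaled = [min(max(c, 0.0), 1.0) * res for c in point]
    base = [min(int(s), res - 1) for s in scaled]
    frac = [s - b for s, b in zip(scaled, base)]
    out = [0.0] * grid["dim"]
    for corner in range(8):
        offsets = [(corner >> axis) & 1 for axis in range(3)]
        weight = 1.0
        for axis in range(3):
            weight *= frac[axis] if offsets[axis] else 1.0 - frac[axis]
        i, j, k = (b + o for b, o in zip(base, offsets))
        embedding = grid["data"][vertex_index(grid, i, j, k)]
        for d in range(grid["dim"]):
            out[d] += weight * embedding[d]
    return out


def branches_to_update(iteration, color_every=2):
    """Assumed schedule: density every iteration, color every color_every-th."""
    return {"density": True, "color": iteration % color_every == 0}


# Assumed sizes: a finer density grid and a coarser color grid.
density_grid = make_grid(res=16, dim=1)
color_grid = make_grid(res=8, dim=3)

# Fill the density grid with a linear function so the result is easy to check:
# trilinear interpolation reproduces linear functions up to rounding error.
fill_from_function(density_grid, lambda x, y, z: (x + 2 * y + 3 * z,))

sample = (0.3, 0.6, 0.9)
density_embedding = interpolate(density_grid, sample)
color_embedding = interpolate(color_grid, sample)
schedule = [branches_to_update(it) for it in range(4)]
```

In this sketch the coarser color grid stores fewer vertices, and skipping color updates on some iterations removes the corresponding back-propagation writes. The hardware techniques in the abstract (merging nearby read requests and merging updates within a sliding window) are not modeled here.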
