Neural Radiance Field (NeRF) is widely seen as an alternative to traditional physically-based rendering. However, NeRF has not yet been adopted in resource-limited mobile systems such as Virtual and Augmented Reality (VR/AR), because it is extremely slow: on a mobile Volta GPU, even state-of-the-art NeRF models generally run at only 0.8 FPS. We show that the main performance bottlenecks are both algorithmic and architectural, and we introduce CICERO to tame both forms of inefficiency. We first introduce two algorithms: one fundamentally reduces the amount of work any NeRF model has to execute, and the other eliminates irregular DRAM accesses. We then describe an on-chip data layout strategy that eliminates SRAM bank conflicts. A pure software implementation of CICERO offers an 8.0x speed-up and 7.9x energy saving over a mobile Volta GPU. When compared to a baseline with a dedicated DNN accelerator, our speed-up and energy reduction increase to 28.2x and 37.8x, respectively, all with minimal quality loss (less than 1.0 dB reduction in peak signal-to-noise ratio).