Neural radiance fields (NeRF) have demonstrated the potential of coordinate-based neural representations (neural fields, or implicit neural representations) in neural rendering. However, using a multi-layer perceptron (MLP) to represent a 3D scene or object requires enormous computational resources and time. Recent studies have reduced these computational inefficiencies by using additional data structures, such as grids or trees. Despite their promising performance, these explicit data structures require a substantial amount of memory. In this work, we present a method to reduce their memory footprint without compromising the advantages of having additional data structures. Specifically, we propose applying the wavelet transform to grid-based neural fields. Grid-based neural fields provide fast convergence, while the wavelet transform, whose efficiency has been demonstrated in high-performance standard codecs, improves the parameter efficiency of the grids. Furthermore, to achieve higher sparsity of the grid coefficients while maintaining reconstruction quality, we present a novel trainable masking approach. Experimental results demonstrate that non-spatial grid coefficients, such as wavelet coefficients, can attain a higher level of sparsity than spatial grid coefficients, resulting in a more compact representation. With our proposed mask and compression pipeline, we achieve state-of-the-art performance within a memory budget of 2 MB. Our code is available at https://github.com/daniel03c1/masked_wavelet_nerf.
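To make the sparsity claim concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single-level orthonormal 2D Haar transform, a smooth synthetic 64×64 grid, and a fixed magnitude-based mask in place of the trainable mask described above; the function names `haar2d`, `ihaar2d`, and `keep_largest` are hypothetical. It compares the reconstruction error when the same fraction of coefficients is kept in the wavelet domain and in the spatial domain.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar2d(x):
    """One-level orthonormal 2D Haar transform (even side lengths assumed)."""
    # Pairwise averages and differences along axis 1, then along axis 0.
    a = (x[:, 0::2] + x[:, 1::2]) / SQRT2
    d = (x[:, 0::2] - x[:, 1::2]) / SQRT2
    x = np.concatenate([a, d], axis=1)
    a = (x[0::2, :] + x[1::2, :]) / SQRT2
    d = (x[0::2, :] - x[1::2, :]) / SQRT2
    return np.concatenate([a, d], axis=0)

def ihaar2d(c):
    """Inverse of haar2d: undo axis 0 first, then axis 1."""
    h, w = c.shape[0] // 2, c.shape[1] // 2
    a, d = c[:h, :], c[h:, :]
    x = np.empty_like(c)
    x[0::2, :] = (a + d) / SQRT2
    x[1::2, :] = (a - d) / SQRT2
    a, d = x[:, :w], x[:, w:]
    out = np.empty_like(c)
    out[:, 0::2] = (a + d) / SQRT2
    out[:, 1::2] = (a - d) / SQRT2
    return out

def keep_largest(coeffs, keep_ratio):
    """Zero all but (at least) the largest-magnitude fraction of coefficients.

    Assumption: a fixed hard threshold stands in for a learned mask.
    """
    k = int(keep_ratio * coeffs.size)
    thresh = np.sort(np.abs(coeffs).ravel())[-k]
    mask = np.abs(coeffs) >= thresh
    return coeffs * mask, mask

# Smooth synthetic "feature grid" (an assumption for illustration only).
u = np.linspace(0.0, 1.0, 64)
grid = np.outer(u, u)

# Mask in the wavelet domain, then reconstruct in the spatial domain.
masked_wavelet, wavelet_mask = keep_largest(haar2d(grid), keep_ratio=0.25)
wavelet_err = np.mean((ihaar2d(masked_wavelet) - grid) ** 2)

# Mask the spatial grid directly at the same keep ratio.
masked_spatial, spatial_mask = keep_largest(grid, keep_ratio=0.25)
spatial_err = np.mean((masked_spatial - grid) ** 2)

print("MSE with 25% of wavelet coefficients:", wavelet_err)
print("MSE with 25% of spatial coefficients:", spatial_err)
```

On this smooth grid, most of the energy concentrates in the low-frequency Haar coefficients, so the wavelet-domain mask discards far less signal than the spatial mask at the same keep ratio. A real scene grid is less smooth, so the gap would differ in practice.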