The multi-plane representation has attracted attention for its fast training and inference in both static and dynamic neural radiance fields. This approach constructs features by projecting points onto learnable grids and interpolating between adjacent vertices. However, despite its multi-resolution design, it is biased toward fine details: it struggles to capture low-frequency details and tends to spend too many parameters on low-frequency features. This bias leads to instability and inefficiency when training poses are sparse. In this work, we propose a method that synergistically integrates the multi-plane representation with a coordinate-based network, which is known for its strong bias toward low-frequency signals. The coordinate-based network is responsible for capturing low-frequency details, while the multi-plane representation focuses on fine-grained details. We demonstrate that residual connections between the two preserve the inherent properties of each. Additionally, the proposed progressive training scheme accelerates the disentanglement of these two kinds of features. We empirically show that the proposed method achieves results comparable to explicit encoding with fewer parameters and, in particular, outperforms other methods on static and dynamic NeRFs under sparse inputs.
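To make the two feature sources described above concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows (a) multi-plane features obtained by projecting 3D points onto three axis-aligned learnable planes and bilinearly interpolating the four adjacent vertices, and (b) a residual-style sum with the output of a small coordinate-based layer. All names here (`bilinear_sample`, `multiplane_features`, `combined_features`, `plane_weight`) are hypothetical. The single `tanh` layer stands in for the coordinate network, summation across planes is one common way to fuse plane features, and the scalar `plane_weight` is only an assumed stand-in for a progressive schedule; the paper's actual residual design and training scheme may differ.

```python
import numpy as np

def bilinear_sample(plane, uv):
    """Bilinearly interpolate a (H, W, C) feature plane at uv in [0, 1]^2.

    Assumes H >= 2 and W >= 2. uv[:, 0] indexes width, uv[:, 1] indexes height.
    """
    H, W, _ = plane.shape
    # Map normalized coordinates to continuous grid indices.
    x = uv[:, 0] * (W - 1)
    y = uv[:, 1] * (H - 1)
    # Lower-left vertex of the enclosing cell, clipped so x0 + 1 and y0 + 1 stay in range.
    x0 = np.clip(np.floor(x).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, H - 2)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    # Blend the four adjacent vertices.
    top = plane[y0, x0] * (1 - wx) + plane[y0, x0 + 1] * wx
    bottom = plane[y0 + 1, x0] * (1 - wx) + plane[y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy

def multiplane_features(planes, xyz):
    """Project points onto the xy, xz, and yz planes and sum the sampled features."""
    axes = [(0, 1), (0, 2), (1, 2)]
    return sum(bilinear_sample(p, xyz[:, list(a)]) for p, a in zip(planes, axes))

def combined_features(xyz, planes, W1, b1, plane_weight=1.0):
    """Residual-style combination of the two branches (illustrative assumption)."""
    # Stand-in for the coordinate-based network: one smooth, low-frequency layer.
    coord = np.tanh(xyz @ W1 + b1)
    # Grid branch: intended to carry the fine-grained detail.
    grid = multiplane_features(planes, xyz)
    # plane_weight is a hypothetical knob; ramping it from 0 to 1 during training
    # would let the coordinate branch fit the coarse signal first.
    return coord + plane_weight * grid

rng = np.random.default_rng(0)
planes = [rng.normal(size=(16, 16, 8)) for _ in range(3)]  # three learnable planes
W1, b1 = rng.normal(size=(3, 8)), np.zeros(8)
xyz = rng.uniform(size=(5, 3))                              # points in [0, 1]^3
features = combined_features(xyz, planes, W1, b1, plane_weight=0.5)
```

With `plane_weight=0.0` the output reduces to the coordinate branch alone, which is the sense in which the grid features act as a residual on top of the low-frequency signal in this sketch.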