Understanding the Spectral Bias of Coordinate Based MLPs Via Training Dynamics

Spectral bias is an important observation about neural network training: a network learns a low-frequency representation of the target function before converging to its higher-frequency components. This property is of interest because of its link to good generalization in over-parameterized networks. However, in scene-rendering applications, where multi-layer perceptrons (MLPs) with ReLU activations take dense, low-dimensional coordinate-based inputs, the spectral bias is so severe that it prevents convergence to high-frequency components entirely. To overcome this limitation, one can encode the inputs using high-frequency sinusoids. Previous works have attempted to explain both spectral bias and its severity in the coordinate-based regime using the Neural Tangent Kernel (NTK) and Fourier analysis. These methods have limitations, however: the NTK does not capture real network dynamics, and Fourier analysis offers only a global perspective on the frequency components of the network. In this paper, we provide a novel approach to understanding spectral bias by directly studying the training dynamics of ReLU MLPs, in order to gain further insight into the properties that induce this behavior in the real network. Specifically, we focus on the connection between the computations of ReLU networks (their activation regions) and the convergence of gradient descent. We study these dynamics in relation to the spatial information of the signal to clarify how they influence spectral bias, which has yet to be demonstrated. Additionally, we use this formulation to study the severity of spectral bias in the coordinate-based setting and why positional encoding overcomes it.
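To make the idea of encoding inputs with high-frequency sinusoids concrete, the following is a minimal sketch of one common form of positional encoding. The function name `positional_encoding`, the parameter `num_frequencies`, and the frequency schedule 2^k * pi are assumptions chosen for illustration; the abstract does not specify the exact encoding studied in the paper.

```python
import math


def positional_encoding(coords, num_frequencies=4):
    """Map each low-dimensional coordinate to sinusoids of increasing frequency.

    coords: list of points, each a list of floats (e.g. values in [0, 1]).
    Returns a list of feature vectors of length 2 * d * num_frequencies,
    where d is the dimension of each point.
    """
    encoded = []
    for point in coords:
        features = []
        for value in point:
            for k in range(num_frequencies):
                # Assumed schedule: frequencies double at each level (2^k * pi).
                angle = (2.0 ** k) * math.pi * value
                features.extend([math.sin(angle), math.cos(angle)])
        encoded.append(features)
    return encoded


# Example: encode five evenly spaced 1-D coordinates in [0, 1].
coords = [[i / 4.0] for i in range(5)]
encoded = positional_encoding(coords, num_frequencies=4)
print(len(encoded), len(encoded[0]))
```

In this sketch the encoded features, rather than the raw coordinates, would be passed to the ReLU MLP, so that nearby inputs are separated along high-frequency directions before the first layer.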
