We present a neural network architecture designed to naturally learn a positional embedding and to overcome the spectral bias toward lower frequencies exhibited by conventional activation functions. Our proposed architecture, SPDER, is a simple MLP that uses an activation function composed of a sinusoid multiplied by a sublinear function, called the damping function. The sinusoid enables the network to automatically learn the positional embedding of an input coordinate, while the damping function passes on the actual coordinate value by preventing it from being projected down to a finite range of values. Our results indicate that SPDERs speed up training by 10x and converge to losses 1,500-50,000x lower than those of the state of the art for image representation. SPDER is also state of the art in audio representation. This superior representation capability allows SPDER to excel on downstream tasks such as image super-resolution and video frame interpolation. We provide intuition as to why SPDER significantly improves fitting compared to other implicit neural representation (INR) methods, while requiring no hyperparameter tuning or preprocessing.
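As a minimal sketch of the activation described above, the following composes a sinusoid with a sublinear damping function. The specific choice of sqrt(|x|) as the damping function is an assumed example here (the abstract only requires the function to be sublinear):

```python
import math

def spder_activation(x: float) -> float:
    """SPDER-style activation: sinusoid times a sublinear damping function.

    sqrt(|x|) is used as an illustrative sublinear damping function;
    it grows without bound (so the coordinate's magnitude is passed on)
    but slower than linearly.
    """
    return math.sin(x) * math.sqrt(abs(x))

# The sinusoid alone maps every input into [-1, 1]; the damping factor
# keeps the output's magnitude tied to the input coordinate's magnitude.
```

Note how, unlike a plain sine activation, the output is not confined to a finite range: for large |x| the envelope sqrt(|x|) lets the activation carry information about the coordinate's scale, which is the intuition the abstract gives for avoiding projection into a bounded interval.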