Implicit Neural Representations (INRs) have recently gained attention as a powerful approach for continuously representing signals such as images, videos, and 3D shapes using multilayer perceptrons (MLPs). However, MLPs are known to exhibit a low-frequency bias, limiting their ability to capture high-frequency details accurately. This limitation is typically addressed by incorporating high-frequency input embeddings or specialized activation layers. In this work, we demonstrate that these embeddings and activations are often configured with hyperparameters that perform well on average but are suboptimal for specific input signals under consideration, necessitating a costly grid search to identify optimal settings. Our key observation is that the initial frequency spectrum of an untrained model's output correlates strongly with the model's eventual performance on a given target signal. Leveraging this insight, we propose frequency shifting (or FreSh), a method that selects embedding hyperparameters to align the frequency spectrum of the model's initial output with that of the target signal. We show that this simple initialization technique improves performance across various neural representation methods and tasks, achieving results comparable to extensive hyperparameter sweeps but with only marginal computational overhead compared to training a single model with default hyperparameters.
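The selection procedure described above can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact method: it assumes a 1D signal, a random-Fourier-feature embedding whose scale `sigma` is the hyperparameter being tuned, a two-layer ReLU MLP, and a 1-Wasserstein distance between normalized magnitude spectra as the alignment criterion. All function names and the specific MLP architecture are hypothetical choices for the sketch.

```python
import numpy as np

def fourier_features(x, B):
    # Random Fourier embedding: gamma(x) = [sin(2*pi*B*x), cos(2*pi*B*x)]
    proj = 2 * np.pi * np.outer(x, B)
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=1)

def untrained_mlp_output(x, sigma, hidden=64, seed=0):
    # Forward pass of a freshly initialized 2-layer ReLU MLP whose input
    # embedding frequencies are scaled by the candidate hyperparameter sigma.
    rng = np.random.default_rng(seed)
    B = sigma * rng.standard_normal(16)           # embedding frequencies
    h = fourier_features(x, B)                    # (N, 32)
    W1 = rng.standard_normal((h.shape[1], hidden)) / np.sqrt(h.shape[1])
    W2 = rng.standard_normal((hidden, 1)) / np.sqrt(hidden)
    return (np.maximum(h @ W1, 0.0) @ W2).ravel()

def spectrum(y):
    # Normalized magnitude spectrum (a probability vector over frequencies).
    mag = np.abs(np.fft.rfft(y - y.mean()))
    return mag / (mag.sum() + 1e-12)

def select_sigma(x, target, candidates):
    # Pick the embedding scale whose *initial* (untrained) output spectrum
    # is closest to the target spectrum, before any training happens.
    s_t = spectrum(target)
    def dist(sigma):
        s_m = spectrum(untrained_mlp_output(x, sigma))
        # 1-Wasserstein distance between the two discrete spectra.
        return np.abs(np.cumsum(s_m - s_t)).sum()
    return min(candidates, key=dist)

x = np.linspace(0.0, 1.0, 512)
target = np.sin(2 * np.pi * 40 * x)               # high-frequency target signal
best = select_sigma(x, target, [1.0, 5.0, 20.0, 80.0])
print(best)
```

The key point the sketch captures is that each candidate hyperparameter costs only a single untrained forward pass plus an FFT, which is why the overhead is marginal compared to a full grid search over trained models.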