Implicit Neural Representations (INRs) have emerged as a promising paradigm for video compression. However, existing INR-based frameworks typically suffer from an inherent spectral bias that favors low-frequency components, leading to over-smoothed reconstructions and suboptimal rate-distortion performance. In this paper, we propose FaNeRV, a Frequency-aware Neural Representation for videos, which explicitly decouples low- and high-frequency components to enable efficient and faithful video reconstruction. FaNeRV introduces a multi-resolution supervision strategy that guides the network to progressively capture global structures and fine-grained textures through staged supervision. To further enhance high-frequency reconstruction, we propose a dynamic high-frequency injection mechanism that adaptively emphasizes challenging regions. In addition, we design a frequency-decomposed network module to improve feature modeling across different spectral bands. Extensive experiments on standard benchmarks demonstrate that FaNeRV significantly outperforms state-of-the-art INR methods and achieves rate-distortion performance competitive with traditional codecs.
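The abstract does not say how the frequency decoupling or the staged supervision is implemented. The following is only a minimal NumPy sketch of the two general ideas, under stated assumptions: an ideal FFT low-pass mask stands in for the decomposition, and average pooling stands in for the multi-resolution targets. The names `split_frequencies`, `downsample`, and `multi_resolution_mse`, and the parameters `cutoff`, `factors`, and `weights`, are hypothetical and are not taken from FaNeRV.

```python
import numpy as np


def split_frequencies(frame, cutoff=0.25):
    """Split a 2-D grayscale frame into low- and high-frequency parts.

    Assumption: an ideal radial low-pass mask in the FFT domain.
    `cutoff` is a fraction of the Nyquist frequency (0.5 cycles/pixel).
    """
    h, w = frame.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, cycles/pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, cycles/pixel
    radius = np.sqrt(fx ** 2 + fy ** 2)
    mask = radius <= cutoff * 0.5
    low = np.fft.ifft2(np.fft.fft2(frame) * mask).real
    high = frame - low  # residual holds everything the mask removed
    return low, high


def downsample(frame, factor):
    """Average-pool by an integer factor; frame sides must be divisible by it."""
    h, w = frame.shape
    return frame.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def multi_resolution_mse(pred, target, factors=(4, 2, 1), weights=(1.0, 1.0, 1.0)):
    """Weighted sum of MSE terms computed at several resolutions.

    Coarse terms (large factor) are sensitive to global structure only;
    the full-resolution term (factor 1) also penalizes missing texture.
    """
    total = 0.0
    for factor, weight in zip(factors, weights):
        diff = downsample(pred, factor) - downsample(target, factor)
        total += weight * np.mean(diff ** 2)
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.random((16, 16))
    low, high = split_frequencies(frame)
    # An over-smoothed reconstruction (low band only) still incurs a loss.
    print(multi_resolution_mse(low, frame))
```

In a staged schedule, the `weights` would shift over training from the coarse terms toward the full-resolution term; how FaNeRV actually schedules its supervision is not stated in the abstract.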