Deep variational autoencoders (VAEs) for image and video compression have gained significant traction in recent years, owing to their potential to offer competitive or better compression rates than long-established traditional codecs such as AVC, HEVC, and VVC. However, because of their complexity and energy consumption, these approaches are still far from practical use in industry. More recently, codecs based on implicit neural representations (INRs) have emerged; at decoding, they have lower complexity and energy usage than classical approaches. However, their performance is not yet on par with state-of-the-art methods. In this work, we first show that an INR-based image codec has lower complexity than VAE-based approaches. We then propose several improvements to the INR-based image codec that outperform the baseline model by a large margin.