Neural video compression (NVC) technologies have advanced rapidly in recent years, yielding state-of-the-art schemes such as DCVC-RT that offer higher compression efficiency than H.266/VVC together with real-time encoding and decoding. Nonetheless, existing NVC schemes have several limitations, including inefficient handling of disocclusion and new content, and interframe error propagation and accumulation. To address these limitations, we borrow an idea from classic video coding schemes, which allow intra coding within inter-coded frames. With the intra coding tool enabled, disocclusion and new content are properly handled, and interframe error propagation is cut off naturally without the need for manual refresh mechanisms. We present an NVC framework with unified intra and inter coding, in which every frame is processed by a single model that is trained to perform intra/inter coding adaptively. Moreover, we propose a simultaneous two-frame compression design that exploits interframe redundancy in both the forward and backward directions. Experimental results show that our scheme outperforms DCVC-RT by an average BD-rate reduction of 12.1%, delivers more stable per-frame bitrate and quality, and retains real-time encoding/decoding performance. Code and models will be released.
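For readers unfamiliar with the BD-rate metric quoted above, the following is a minimal sketch of the commonly used Bjøntegaard delta-rate calculation. It is not taken from this work: the abstract does not say which BD-rate variant or quality metric is used, so the cubic polynomial fit, the use of PSNR, the function name `bd_rate`, and the sample rate-distortion points are all assumptions made for illustration.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate change (%) of the test codec relative to the anchor
    at equal quality. Negative values mean the test codec needs fewer bits.

    Assumption: classic cubic-polynomial Bjontegaard fit over PSNR.
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Fit log-rate as a cubic polynomial of quality for each codec.
    poly_a = np.polyfit(pa, log_ra, 3)
    poly_t = np.polyfit(pt, log_rt, 3)

    # Compare the two curves only over the quality range they share.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())

    # Mean log-rate of each curve over [lo, hi] via the antiderivative.
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    # Convert the mean log-rate difference back to a percentage.
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0


# Hypothetical rate-distortion points (kbps, dB), for illustration only.
anchor_rate = [400.0, 800.0, 1600.0, 3200.0]
anchor_psnr = [32.0, 35.0, 38.0, 41.0]
test_rate = [r * 0.9 for r in anchor_rate]  # same quality at 10% fewer bits
test_psnr = anchor_psnr

print(f"BD-rate: {bd_rate(anchor_rate, anchor_psnr, test_rate, test_psnr):.2f}%")
```

Because the comparison is made in the log-rate domain and averaged over the shared quality range, a uniform 10% bitrate saving at every quality level maps to a BD-rate of about -10%. A reported BD-rate reduction of 12.1% is read the same way, as an average bitrate saving at matched quality.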