Neural video compression (NVC) has advanced rapidly in recent years, yielding state-of-the-art schemes such as DCVC-RT that offer compression efficiency superior to H.266/VVC together with real-time encoding and decoding. Nonetheless, existing NVC schemes have several limitations, including inefficiency in handling disocclusion and newly appearing content, as well as interframe error propagation and accumulation. To eliminate these limitations, we borrow an idea from classic video coding: allowing intra coding within inter-coded frames. With the intra coding tool enabled, disocclusion and new content are handled properly, and interframe error propagation is naturally intercepted without the need for manual refresh mechanisms. We present an NVC framework with unified intra and inter coding, where every frame is processed by a single model trained to perform intra/inter coding adaptively. Moreover, we propose a simultaneous two-frame compression design that exploits interframe redundancy not only forwardly but also backwardly. Experimental results show that our scheme outperforms DCVC-RT by an average 10.7\% BD-rate reduction, delivers more stable per-frame bitrate and quality, and retains real-time encoding/decoding performance. Code and models will be released.