The rise of new video modalities, such as those used in virtual reality and autonomous driving, has increased the demand for efficient multi-view video compression methods, both in terms of rate-distortion (R-D) performance and in terms of delay and runtime. While most recent stereo video compression approaches have shown promising performance, they compress the left and right views sequentially, which leads to poor parallelization and runtime performance. This work presents Low-Latency neural codec for Stereo video Streaming (LLSS), a novel parallel stereo video coding method designed for fast and efficient low-latency stereo video streaming. Instead of using sequential cross-view motion compensation as existing methods do, LLSS introduces a bidirectional feature shifting module that directly exploits the mutual information between views, and encodes the views with a joint cross-view prior model for entropy coding. Thanks to this design, LLSS processes the left and right views in parallel, minimizing latency while substantially improving R-D performance compared to both existing neural and conventional codecs.
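The abstract does not specify how the bidirectional feature shifting module is built, so the following is only a minimal NumPy sketch of one plausible reading: each view's feature map is shifted horizontally by a few candidate disparities and concatenated onto the other view's features. The function names, the `disparities` parameter, and the zero-fill behavior are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def shift_horizontal(feat, d):
    """Shift a (C, H, W) feature map by d pixels along the width axis.

    Positive d moves content to the right, negative d to the left.
    Vacated columns are zero-filled (an assumption of this sketch).
    """
    out = np.zeros_like(feat)
    if d == 0:
        out[...] = feat
    elif d > 0:
        out[:, :, d:] = feat[:, :, :-d]
    else:
        out[:, :, :d] = feat[:, :, -d:]
    return out


def bidirectional_shift(left, right, disparities=(0, 1, 2)):
    """Build cross-view context for both views (hypothetical helper).

    Right-view features are shifted right (+d) to roughly align with the
    left view; left-view features are shifted left (-d) to roughly align
    with the right view. Neither output depends on the other, so the two
    branches could be computed in parallel.
    """
    left_ctx = np.concatenate(
        [left] + [shift_horizontal(right, d) for d in disparities], axis=0
    )
    right_ctx = np.concatenate(
        [right] + [shift_horizontal(left, -d) for d in disparities], axis=0
    )
    return left_ctx, right_ctx
```

In this sketch, two views with `C` channels each and `K` candidate disparities yield two context tensors with `C * (1 + K)` channels, which a learned network could then fuse. The point of the illustration is only that neither branch waits on the other, in contrast to a sequential scheme where one view must be decoded before the other is predicted from it.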