GRACE: Loss-Resilient Real-Time Video through Neural Codecs

In real-time video communication, retransmitting lost packets over high-latency networks is not viable due to strict latency requirements. To counter packet losses without retransmission, two primary strategies are employed -- encoder-based forward error correction (FEC) and decoder-based error concealment. The former encodes data with redundancy before transmission, yet determining the optimal redundancy level in advance proves challenging. The latter reconstructs video from partially received frames, but dividing a frame into independently coded partitions inherently compromises compression efficiency, and the lost information cannot be effectively recovered by the decoder without adapting the encoder. We present a loss-resilient real-time video system called GRACE, which preserves the user's quality of experience (QoE) across a wide range of packet losses through a new neural video codec. Central to GRACE's enhanced loss resilience is its joint training of the neural encoder and decoder under a spectrum of simulated packet losses. In lossless scenarios, GRACE achieves video quality on par with conventional codecs (e.g., H.265). As the loss rate escalates, GRACE exhibits a more graceful, less pronounced decline in quality, consistently outperforming other loss-resilient schemes. Through extensive evaluation on various videos and real network traces, we demonstrate that GRACE reduces undecodable frames by 95% and stall duration by 90% compared with FEC, while markedly boosting video quality over error concealment methods. In a user study with 240 crowdsourced participants and 960 subjective ratings, GRACE registers a 38% higher mean opinion score (MOS) than other baselines.
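The idea of training under "a spectrum of simulated packet losses" can be illustrated with a minimal sketch. It is not GRACE's actual implementation: the element-to-packet mapping, the choice to model a lost packet as zeros, the loss-rate distribution, and the names `simulate_packet_loss` and `sample_loss_rate` are all assumptions made for illustration.

```python
import random

def simulate_packet_loss(latent, loss_rate, num_packets, rng):
    """Zero out the latent values carried by randomly dropped packets.

    Assumption: element i travels in packet (i % num_packets), and a lost
    packet is modeled by replacing the values it carried with zeros.
    """
    # Each packet is dropped independently with probability loss_rate.
    lost = [rng.random() < loss_rate for _ in range(num_packets)]
    return [0.0 if lost[i % num_packets] else v for i, v in enumerate(latent)]

def sample_loss_rate(rng, max_rate=0.8):
    """Draw a loss rate so that training sees both loss-free and lossy frames.

    Assumption: half of the samples are loss-free and the rest are uniform
    in [0, max_rate]; max_rate=0.8 is a hypothetical value.
    """
    return 0.0 if rng.random() < 0.5 else rng.uniform(0.0, max_rate)

rng = random.Random(0)
latent = [float(i + 1) for i in range(12)]  # stand-in for the encoder output
for step in range(3):
    rate = sample_loss_rate(rng)
    received = simulate_packet_loss(latent, rate, num_packets=4, rng=rng)
    # In joint training, the decoder would reconstruct the frame from
    # `received`, and the reconstruction error would update both the
    # encoder and the decoder.
    print(step, round(rate, 2), received)
```

Because the encoder is trained together with the decoder on such masked inputs, both sides can in principle learn a representation that degrades gradually as more values go missing, rather than one that relies on a redundancy level fixed in advance.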
