Contemporary real-time video communication systems, such as WebRTC, use an adaptive bitrate (ABR) algorithm to ensure high-quality, low-delay service, for example by promptly adjusting the video bitrate to match the instantaneous network bandwidth. However, the target bitrate decision in the network and the bitrate control in the codec are typically uncoordinated, and ignoring the effect of inappropriate resolution and frame rate settings further compromises bitrate control; together these severely degrade the quality of experience (QoE). To tackle these challenges, we propose Mamba, an end-to-end multi-dimensional ABR algorithm for video conferencing systems that uses multi-agent reinforcement learning (MARL) to maximize the user's QoE. Mamba adaptively and collaboratively adjusts encoding factors, including the quantization parameter (QP), resolution, and frame rate, based on observed states such as network conditions and video complexity information. We also introduce curriculum learning to improve the training efficiency of MARL. Both in-lab and real-world evaluation results demonstrate the efficacy of Mamba.