We study a multi-task decision-making problem for 360 video processing in a wireless multi-user virtual reality (VR) system. The system includes an edge computing unit (ECU) that delivers 360 videos to VR users and offers computing assistance for decoding and rendering video frames. This assistance, however, comes at the expense of increased data volume and required bandwidth. To balance this trade-off, we formulate a constrained quality of experience (QoE) maximization problem in which the rebuffering time and the quality variation between video frames are bounded by user and video requirements. To solve this multi-user QoE maximization problem, we leverage deep reinforcement learning (DRL) for multi-task rate adaptation and computation distribution (MTRC). The proposed MTRC approach makes no predefined assumptions about the environment. Instead, it relies on video playback statistics (e.g., past throughput, decoding time, and transmission time), video information, and the resulting performance to adjust the video bitrate and computation distribution. We train MTRC with real-world wireless network traces and 360 video datasets and evaluate it in terms of average QoE, peak signal-to-noise ratio (PSNR), rebuffering time, and quality variation. Our results indicate that MTRC improves the users' QoE compared to a state-of-the-art rate adaptation algorithm. Specifically, we show a 5.97 dB to 6.44 dB improvement in PSNR, a 1.66x to 4.23x improvement in rebuffering time, and a 4.21 dB to 4.35 dB improvement in quality variation.
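To make the constrained QoE objective more concrete, the following is a minimal sketch of one common way such a score is built in the rate adaptation literature: per-frame quality minus penalties for rebuffering and for quality variation, with the two bounds checked separately. The abstract does not give the exact formulation, so the function names, the linear penalty form, the weights, and the thresholds below are all illustrative assumptions rather than the authors' definition.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class QoEWeights:
    # Hypothetical penalty weights; not taken from the paper.
    rebuffer: float = 1.0   # penalty per second of rebuffering
    variation: float = 1.0  # penalty per dB of frame-to-frame quality change


def session_qoe(psnr_db: Sequence[float],
                rebuffer_s: Sequence[float],
                weights: QoEWeights = QoEWeights()) -> float:
    """Mean per-frame score: quality minus rebuffering and variation penalties."""
    if not psnr_db or len(psnr_db) != len(rebuffer_s):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for i, (quality, stall) in enumerate(zip(psnr_db, rebuffer_s)):
        # Quality variation is measured against the previous frame.
        variation = abs(quality - psnr_db[i - 1]) if i > 0 else 0.0
        total += quality - weights.rebuffer * stall - weights.variation * variation
    return total / len(psnr_db)


def satisfies_bounds(psnr_db: Sequence[float],
                     rebuffer_s: Sequence[float],
                     max_rebuffer_s: float,
                     max_variation_db: float) -> bool:
    """Check the two (assumed) constraints: total stall time and worst quality jump."""
    total_rebuffer = sum(rebuffer_s)
    worst_variation = max(
        (abs(b - a) for a, b in zip(psnr_db, psnr_db[1:])), default=0.0
    )
    return total_rebuffer <= max_rebuffer_s and worst_variation <= max_variation_db


if __name__ == "__main__":
    psnr = [40.0, 42.0, 41.0]   # made-up per-frame PSNR values in dB
    stalls = [0.0, 0.5, 0.0]    # made-up per-frame rebuffering in seconds
    print(session_qoe(psnr, stalls))
    print(satisfies_bounds(psnr, stalls, max_rebuffer_s=1.0, max_variation_db=2.0))
```

In a DRL setting, a score of this kind would typically serve as the reward signal, while the bound check indicates whether a chosen bitrate and computation split stays within the user and video requirements.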