A key challenge of 360$^\circ$ VR video streaming is ensuring high quality under limited network bandwidth. Most current studies focus on tile-based adaptive bitrate streaming to reduce bandwidth consumption, which leaves resources in network nodes underutilized. This article proposes a tile-weighted rate-distortion (TWRD) packet scheduling optimization system to reduce data volume and improve video quality. A multimodal spatial-temporal attention transformer predicts the viewpoint together with its probability, which is used to dynamically weight tiles and their corresponding packets. The packet scheduling problem, that is, determining which packets should be dropped, is formulated as an optimization problem and solved with dynamic programming. Experimental results demonstrate that the proposed method outperforms existing methods under various conditions.
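To make the packet-dropping idea concrete, the following is a minimal sketch and not the article's actual formulation, which the abstract does not specify. It assumes the problem can be modeled as a 0/1 knapsack: each packet has an integer size, a distortion reduction obtained if it is delivered, and a tile weight taken from the predicted viewing probability, and the node keeps the subset that maximizes the weighted distortion reduction within a bandwidth budget. The function name `schedule_packets` and all parameters are hypothetical.

```python
from typing import List, Tuple


def schedule_packets(
    sizes: List[int],
    gains: List[float],
    weights: List[float],
    budget: int,
) -> Tuple[List[int], List[int]]:
    """Choose which packets to keep and which to drop (illustrative only).

    Assumed model: keep the subset of packets that maximizes
    sum(weights[i] * gains[i]) subject to sum(sizes[i]) <= budget.
    sizes are positive integers in some unit (e.g. bytes or MTU slots).
    """
    n = len(sizes)
    # best[i][b]: best weighted gain using the first i packets with capacity b.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        size = sizes[i - 1]
        value = weights[i - 1] * gains[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: drop packet i-1
            if size <= b:
                candidate = best[i - 1][b - size] + value  # option 2: keep it
                if candidate > best[i][b]:
                    best[i][b] = candidate

    # Backtrack: a packet was kept wherever keeping it changed the table entry.
    kept: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            kept.append(i - 1)
            b -= sizes[i - 1]
    kept.reverse()
    kept_set = set(kept)
    dropped = [i for i in range(n) if i not in kept_set]
    return kept, dropped


if __name__ == "__main__":
    # Four packets of equal distortion gain; tile weights differ by
    # (hypothetical) predicted viewing probability.
    kept, dropped = schedule_packets(
        sizes=[4, 3, 2, 3],
        gains=[10.0, 10.0, 10.0, 10.0],
        weights=[0.9, 0.1, 0.6, 0.3],
        budget=6,
    )
    print("kept:", kept, "dropped:", dropped)
```

In this toy instance, the packets belonging to the tiles most likely to be viewed are kept and the rest are dropped. The table has one entry per packet and capacity unit, so time and memory grow with the number of packets times the budget. A real system would also need to account for decoding dependencies between packets, which this sketch ignores.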