Short video applications have attracted billions of users on the Internet, filling users' fragmented spare time with short, content-rich videos. To achieve fast playback on the user side, existing short video systems typically transmit the initial segment of each requested video in a burst to improve the quality of user experience. However, such burst transmission can cause unexpectedly large startup delays on the user side. This is because users may frequently switch videos when sequentially watching a list of short videos recommended by the server, which triggers excessive burst transmissions of the initial segments of different videos and thus quickly depletes the network transmission capacity. In this paper, we adopt a token bucket to characterize the video transmission path between the video server and each user, and accordingly study how to reduce the startup delay of short videos by arranging the viewing order of a video list at the server side. We formulate the optimal video ordering problem of minimizing the maximum video startup delay as a combinatorial optimization problem and prove its NP-hardness. We then propose a Partially Shared Actor Critic reinforcement learning algorithm (PSAC) to learn an optimized video ordering strategy. Numerical results based on a real dataset provided by a large-scale short video service provider demonstrate that the proposed PSAC algorithm can significantly reduce the video startup delay compared to baseline algorithms.
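To make the role of the token bucket concrete, the following is a minimal illustrative sketch and not the formulation used in the paper. It assumes a bucket that starts full, a fixed refill rate, one burst per video equal to its initial segment size, and a known watch time per video before the user switches. The names `startup_delays`, `best_order`, `sizes`, `watch_times`, `capacity`, and `rate` are all hypothetical. The exhaustive search over orderings stands in for the learned policy only to show what "minimizing the maximum startup delay" means on a tiny list.

```python
from itertools import permutations


def startup_delays(order, sizes, watch_times, capacity, rate):
    """Startup delay of each video when the list is watched in `order`.

    Simplifying assumptions (not taken from the paper): the bucket starts
    full, tokens refill at `rate` per second up to `capacity`, and a video
    starts playing once its whole initial segment has been sent.
    """
    tokens = capacity
    delays = []
    for i in order:
        # Tokens missing for the burst must be waited for at the refill rate.
        deficit = max(0.0, sizes[i] - tokens)
        delays.append(deficit / rate)
        # The burst consumes the tokens (the bucket is empty after a wait).
        tokens = max(0.0, tokens - sizes[i])
        # The bucket refills while the user watches this video.
        tokens = min(capacity, tokens + rate * watch_times[i])
    return delays


def best_order(sizes, watch_times, capacity, rate):
    """Exhaustively pick the order with the smallest maximum startup delay."""
    candidates = permutations(range(len(sizes)))
    return min(
        candidates,
        key=lambda o: max(startup_delays(o, sizes, watch_times, capacity, rate)),
    )


# Toy list: two large segments watched briefly, one small segment watched long.
sizes = [8, 2, 8]
watch_times = [1, 10, 1]
order = best_order(sizes, watch_times, capacity=10, rate=1)
worst = max(startup_delays(order, sizes, watch_times, capacity=10, rate=1))
```

In this toy setting, placing the long-watched small video between the two large ones gives the bucket time to refill, whereas playing the two large videos back to back forces the second one to wait. The search visits all n! orderings, so it is only usable for very short lists, which is consistent with the NP-hardness result motivating a learned ordering strategy.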