Short video platforms such as TikTok, Instagram Reels, and YouTube Shorts have gained immense popularity in the last few years and are responsible for a large and growing fraction of Internet traffic. We identify two unique opportunities for improving short video delivery using their existing interactions with content delivery networks (CDNs). First, short videos use a push-based recommendation system, in which the user is presented with a sequence of videos recommended by the algorithm rather than explicitly picking content to watch (as on YouTube). Such push-based systems offer a unique opportunity for system design because they provide visibility into upcoming requests. Second, the popularity of these videos follows a highly skewed Pareto distribution, leading to geographical and temporal overlap among the videos being served. We leverage these opportunities to build SILC, a lookahead-aware caching system aimed at (i) reducing CDN cache miss rates and (ii) reducing midgress bandwidth between the CDN and the origin server. Our evaluation of SILC uses traces that we collect from real users through (i) an in-person user study and (ii) a data donation program involving 100 TikTok users across the world. Using a combination of these traces, we simulate traffic from 10,000 simultaneous users. Our evaluation shows that, compared to 10 state-of-the-art heuristic and learning-based cache eviction policies, SILC reduces a CDN's midgress costs by 11.1% to 111%.
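To make the idea of lookahead-aware eviction concrete, the following is a minimal sketch and not SILC's actual algorithm, which the abstract does not specify. It assumes a single merged request trace, a cache capacity counted in objects, and a fixed lookahead window of known upcoming requests; the function name `simulate` and its parameters are hypothetical. On a miss, it evicts the cached video whose next known request is furthest away, treating videos absent from the window as never requested again and breaking ties in least-recently-used order.

```python
from collections import OrderedDict


def simulate(trace, capacity, lookahead):
    """Count misses for a cache that can see `lookahead` upcoming requests.

    Illustrative sketch only: eviction picks the cached video whose next
    known use is furthest away; videos not in the window count as infinitely
    far. With lookahead=0 this degenerates to plain LRU.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")

    cache = OrderedDict()  # iteration order: least recently used first
    misses = 0

    for i, video in enumerate(trace):
        if video in cache:
            cache.move_to_end(video)  # mark as most recently used
            continue

        misses += 1
        if len(cache) >= capacity:
            # Requests the cache is assumed to know about in advance.
            window = trace[i + 1 : i + 1 + lookahead]
            next_use = {}
            for offset, upcoming in enumerate(window):
                next_use.setdefault(upcoming, offset)  # keep earliest offset

            # max() returns the first maximal element, so ties fall to the
            # least recently used entry.
            victim = max(cache, key=lambda v: next_use.get(v, float("inf")))
            del cache[victim]

        cache[video] = True

    return misses
```

For the toy trace `a, b, c, a, b, d, a, b` with a capacity of 2, this sketch records 8 misses with no lookahead and 6 misses with a lookahead of 3. Every miss at the CDN becomes a fetch from the origin server, so fewer misses in this model correspond to less midgress traffic.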