Preloading is widely used in short-video platforms to minimize playback stalls by downloading upcoming content in advance. However, existing strategies face a tradeoff: aggressive preloading reduces stalls but wastes bandwidth, while conservative preloading saves data but raises the risk of stalls. This paper presents PromptPream, a computation-powered preloading paradigm that breaks this tradeoff by using local computation to reduce bandwidth demand. Instead of transmitting pixel-level video chunks, PromptPream sends compact semantic prompts that are decoded into high-quality frames by generative models such as Stable Diffusion. We propose three core techniques to enable this paradigm: (1) a gradient-based prompt inversion method that compresses frames into small sets of token embeddings; (2) a computation-aware scheduling strategy that jointly optimizes network and compute resource usage; and (3) a scalable search algorithm that handles the enlarged scheduling space introduced by the scheduler. Evaluations show that, compared to traditional strategies, PromptStream reduces both stalls and bandwidth waste by over 31% and improves Quality of Experience (QoE) by 45%.
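To make the bandwidth-versus-computation tradeoff concrete, the following is a minimal sketch of a per-video delivery decision. It is not the scheduler proposed in the paper: the `Option` type, the `ready_time` and `pick_mode` helpers, the byte sizes, and the decode time are all hypothetical values chosen for illustration. The sketch assumes a preloaded video is ready once its bytes have arrived and any local decoding has finished, and it prefers the option that uses the fewest bytes while still meeting a playback deadline.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One way to deliver an upcoming video (all fields are illustrative)."""
    name: str         # delivery mode label
    size_bytes: int   # bytes that must cross the network
    decode_s: float   # local compute time after download (0 for pixel chunks)


def ready_time(opt: Option, bandwidth_bps: float) -> float:
    """Seconds until the content is playable: transfer time plus decode time."""
    return opt.size_bytes * 8 / bandwidth_bps + opt.decode_s


def pick_mode(options: list[Option], bandwidth_bps: float, deadline_s: float) -> Option:
    """Choose the cheapest option (in bytes) that is ready before the deadline.

    If no option meets the deadline, fall back to the one that is ready soonest.
    """
    feasible = [o for o in options if ready_time(o, bandwidth_bps) <= deadline_s]
    if not feasible:
        return min(options, key=lambda o: ready_time(o, bandwidth_bps))
    return min(feasible, key=lambda o: o.size_bytes)


# Hypothetical numbers: a 1 MB pixel chunk versus a 20 KB prompt that
# needs 1.5 s of local generative decoding.
PIXEL = Option("pixel", size_bytes=1_000_000, decode_s=0.0)
PROMPT = Option("prompt", size_bytes=20_000, decode_s=1.5)

if __name__ == "__main__":
    for bandwidth, deadline in [(2e6, 3.0), (20e6, 1.0), (20e6, 3.0)]:
        choice = pick_mode([PIXEL, PROMPT], bandwidth, deadline)
        print(f"{bandwidth / 1e6:.0f} Mbps, deadline {deadline}s -> {choice.name}")
```

Under these assumed numbers, the prompt wins on a slow link because transfer time dominates, while the pixel chunk wins on a fast link with a tight deadline because decode time dominates. A scheduler that considers both resources jointly has to navigate exactly this kind of crossover.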