OneLive: Dynamically Unified Generative Framework for Live-Streaming Recommendation

Shen Wang, Yusheng Huang, Ruochen Yang, Shuang Wen, Pengbo Xu, Jiangxia Cao, Yueyang Liu, Kuo Cai, Chengcheng Guo, Shiyao Wang, Xinchen Luo, Qiang Luo, Ruiming Tang, Shuang Yang, Zhaojie Liu, Guorui Zhou, Han Li, Kun Gai

From arXiv. Work in progress.

Live-streaming recommender systems serve as critical infrastructure, bridging real-time interaction patterns between users and authors. Like traditional industrial recommender systems, live-streaming recommendation relies on cascade architectures to support large-scale concurrency. Recent advances in generative recommendation unify the multi-stage recommendation process with Transformer-based architectures, offering improved scalability and higher computational efficiency. However, these methods do not transfer directly to the live-streaming scenario, where continuously evolving content, limited lifecycles, strict real-time constraints, and heterogeneous objectives introduce unique challenges that invalidate static tokenization and conventional model frameworks. To address these issues, we propose OneLive, a dynamically unified generative recommendation framework tailored for the live-streaming scenario. OneLive integrates four key components: (i) a Dynamic Tokenizer that continuously encodes evolving real-time live content, fused with behavior signals, through residual quantization; (ii) a Time-Aware Gated Attention mechanism that explicitly models temporal dynamics for timely decision making; (iii) an efficient decoder-only generative architecture enhanced with Sequential MTP and QK Norm for stable training and accelerated inference; and (iv) a Unified Multi-Objective Alignment Framework that reinforces policy optimization for personalized preferences.
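To make the tokenization step in component (i) concrete, the following is a minimal sketch of generic greedy residual quantization, not the OneLive tokenizer itself. The abstract does not give codebook sizes, the number of levels, or how content and behavior signals are fused, so the names `residual_quantize` and `reconstruct`, the three random codebooks of eight codewords each, and the 4-dimensional embedding are all illustrative assumptions.

```python
import numpy as np


def residual_quantize(x, codebooks):
    """Greedy residual quantization: pick one codeword index per level.

    At each level the nearest codeword to the current residual is chosen
    and subtracted, so later levels encode what earlier levels missed.
    """
    residual = np.asarray(x, dtype=float).copy()
    codes = []
    for codebook in codebooks:
        # Euclidean distance from the residual to every codeword at this level.
        distances = np.linalg.norm(codebook - residual, axis=1)
        index = int(np.argmin(distances))
        codes.append(index)
        residual = residual - codebook[index]
    return codes, residual


def reconstruct(codes, codebooks):
    """Sum the selected codewords to approximate the original vector."""
    return sum(codebook[i] for codebook, i in zip(codebooks, codes))


# Hypothetical setup: 3 levels, 8 codewords per level, 4-dim embeddings.
# Random codebooks stand in for learned ones (an assumption of this sketch).
rng = np.random.default_rng(0)
codebooks = [rng.normal(scale=1.0 / (2 ** level), size=(8, 4)) for level in range(3)]

embedding = rng.normal(size=4)  # stand-in for a fused content/behavior embedding
codes, residual = residual_quantize(embedding, codebooks)
approx = reconstruct(codes, codebooks)
```

Here `codes` is the token sequence a generative model would predict for the item, and `embedding - approx` equals the leftover `residual`. In a dynamic setting, the same routine would presumably be re-run as the live content embedding changes, so an item's tokens can change over its lifecycle.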
