Rec-Distill: An Industrial Distillation Pipeline for Large-Scale Recommendation Models

Haoran Ding, Wenlin Zhao, Yuchen Jiang, Juren Li, Jie Zhu, Xinchun Li, Yishujie Zhao, Yi Zhang, Ao Qiao, Jianhui Dong, Cheng Chen, Ziyan Gong, Deping Xie, Peng Xu, Zikai Wang, Yuwei Wang, Huizhi Yang, Zhe Chen, Yuchao Zheng

Large recommendation models show substantial potential gains under scaling laws, yet these gains are difficult to realize in industrial recommendation systems because real-world deployment requires lightweight models that meet strict serving-efficiency and latency requirements. This creates a fundamental gap between offline model scaling and online deployment. In this work, we present Rec-Distill, an industrial distillation pipeline that transfers the performance gains of large-scale recommendation modeling to efficient serving models. Rec-Distill combines large-teacher scaling with student-side transfer optimization through decoupled training, black-box distillation, a debiasing mechanism, and a hybrid batch-streaming pipeline for dynamic recommendation environments. Across multiple recommendation and advertising scenarios on real-world platforms, our framework scales teacher models up to 24B dense parameters and behavior sequences of length 20K, while enabling lightweight students to recover a substantial portion of teacher gains, with distillation transferability exceeding 60% in the best setting. Extensive offline and online experiments further show that these transferred gains consistently translate into measurable business improvements under industrial constraints. These results demonstrate that Rec-Distill provides a practical framework for distilling large-scale recommendation models into deployable, cost-efficient serving systems, and that it establishes a reliable path toward scaling recommendation models to even larger regimes.
