Large recommendation models show substantial potential gains under scaling laws, yet these gains are difficult to realize in industrial recommendation systems, because real-world deployment requires lightweight models that meet strict serving-efficiency and latency guarantees. This creates a fundamental gap between offline model scaling and online deployment. In this work, we present Rec-Distill, an industrial distillation pipeline that transfers the performance gains of large-scale recommendation modeling to efficient serving models. Rec-Distill combines large-teacher scaling with student-side transfer optimization through decoupled training, black-box distillation, a debiasing mechanism, and a hybrid batch-streaming pipeline for dynamic recommendation environments. Across multiple recommendation and advertising scenarios on real-world platforms, our framework scales teacher models up to 24B dense parameters and a behavior sequence length of 20K, while enabling lightweight students to recover a substantial portion of the teacher gains, with distillation transferability exceeding 60% in the best setting. Extensive offline and online experiments further show that these transferred gains consistently translate into measurable business improvements under industrial constraints. These results demonstrate that Rec-Distill provides a practical framework for distilling large-scale recommendation models into deployable, cost-efficient serving systems, and that it establishes a reliable path toward scaling recommendation models to even larger regimes.
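To make two terms from the abstract concrete, the following is a minimal sketch under stated assumptions, not the Rec-Distill implementation. The abstract does not define "distillation transferability" or give the distillation loss. Here we assume transferability is the fraction of the teacher's metric gain over a baseline that the student recovers, and we assume black-box distillation means the student sees only the teacher's predicted probabilities, not its internals. The function names, the mixing weight `alpha`, and the metric values in the usage note are all hypothetical.

```python
import math


def bce(p: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy of prediction p against a hard or soft target in [0, 1]."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def blackbox_distill_loss(student_p: float, label: int, teacher_p: float,
                          alpha: float = 0.5) -> float:
    """Assumed loss: mix the hard-label loss with a soft-label loss on teacher scores.

    Only the teacher's output probability is used (black-box setting).
    alpha is a hypothetical mixing weight, not a value from the paper.
    """
    hard = bce(student_p, float(label))  # supervised term on the observed label
    soft = bce(student_p, teacher_p)     # distillation term on the teacher score
    return (1.0 - alpha) * hard + alpha * soft


def transferability(baseline: float, student: float, teacher: float) -> float:
    """Assumed definition: share of the teacher's gain over baseline that the student keeps."""
    teacher_gain = teacher - baseline
    if teacher_gain <= 0:
        raise ValueError("teacher must improve on the baseline")
    return (student - baseline) / teacher_gain
```

With purely illustrative metric values, `transferability(0.700, 0.706, 0.710)` evaluates to approximately 0.6, i.e. the student would retain about 60% of the teacher's gain under this assumed definition.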