While the OneRec series has successfully unified the fragmented recommendation pipeline into an end-to-end generative framework, a significant gap remains between recommender systems and general intelligence. Constrained by isolated data, they operate as domain specialists: proficient in pattern matching but lacking world knowledge, reasoning capabilities, and instruction following. This limitation is further compounded by the lack of a holistic benchmark to evaluate such integrated capabilities. To address this, our contributions are: 1) RecIF-Bench & Open Data: We propose RecIF-Bench, a holistic benchmark covering 8 diverse tasks that thoroughly evaluate capabilities ranging from fundamental prediction to complex reasoning. Concurrently, we release a massive training dataset comprising 96 million interactions from 160,000 users to facilitate reproducible research. 2) Framework & Scaling: To ensure full reproducibility, we open-source our comprehensive training pipeline, encompassing data processing, co-pretraining, and post-training. Leveraging this framework, we demonstrate that recommendation capabilities scale predictably while catastrophic forgetting of general knowledge is mitigated. 3) OneRec Foundation: We release OneRec Foundation (1.7B and 8B), a family of models that establishes new state-of-the-art (SOTA) results across all tasks in RecIF-Bench. Furthermore, when transferred to the Amazon benchmark, our models surpass the strongest baselines by an average of 26.8% in Recall@10 across 10 diverse datasets (Figure 1). This work marks a step towards building truly intelligent recommender systems. Nonetheless, realizing this vision presents significant technical and theoretical challenges, highlighting the need for broader research engagement in this promising direction.
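For context, the Amazon transfer results above are reported as Recall@10, averaged over users and datasets. The sketch below shows how Recall@K is typically computed for a single user; the function name and item IDs are illustrative and are not taken from the released pipeline.

```python
def recall_at_k(ranked_items, relevant_items, k=10):
    """Recall@K for one user: the fraction of that user's held-out relevant
    items that appear among the top-K recommended items."""
    if not relevant_items:
        return 0.0
    top_k = set(ranked_items[:k])
    hits = sum(1 for item in relevant_items if item in top_k)
    return hits / len(relevant_items)

# Toy usage with hypothetical item IDs: 2 of the 3 relevant items fall in the top 10.
ranked = ["i7", "i2", "i9", "i4", "i1", "i8", "i3", "i6", "i5", "i0", "i11"]
relevant = {"i2", "i1", "i12"}
print(recall_at_k(ranked, relevant, k=10))  # 0.666...
```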