Large foundation models, through upstream pre-training and downstream fine-tuning, have achieved immense success across the broad AI community, owing to improved model performance and significant reductions in repetitive engineering. By contrast, transferable one-for-all models in the recommender system field, referred to as TransRec, have made limited progress. The development of TransRec has encountered multiple challenges, among which the lack of large-scale, high-quality transfer learning recommendation datasets and benchmark suites is one of the biggest obstacles. To this end, we introduce NineRec, a TransRec dataset suite that comprises a large-scale source domain recommendation dataset and nine diverse target domain recommendation datasets. Each item in NineRec is accompanied by a descriptive text and a high-resolution cover image. With NineRec, TransRec models can be implemented by learning from raw multimodal features instead of relying solely on pre-extracted off-the-shelf features. Finally, we present robust TransRec benchmark results with several classical network architectures, providing valuable insights into the field. To facilitate further research, we will release our code, datasets, benchmarks, and leaderboards at https://github.com/westlake-repl/NineRec.