NineRec: A Benchmark Dataset Suite for Evaluating Transferable Recommendation

Large foundational models, through upstream pre-training and downstream fine-tuning, have achieved immense success across the broad AI community, owing to improved model performance and significant reductions in repetitive engineering. By contrast, transferable one-for-all models in the recommender system field, referred to as TransRec, have made limited progress. The development of TransRec has encountered multiple challenges, among which the lack of large-scale, high-quality transfer learning recommendation datasets and benchmark suites is one of the biggest obstacles. To this end, we introduce NineRec, a TransRec dataset suite that comprises a large-scale source domain recommendation dataset and nine diverse target domain recommendation datasets. Each item in NineRec is accompanied by a descriptive text and a high-resolution cover image. NineRec makes it possible to implement TransRec models that learn from raw multimodal features instead of relying solely on pre-extracted off-the-shelf features. Finally, we present robust TransRec benchmark results with several classical network architectures, providing valuable insights into the field. To facilitate further research, we will release our code, datasets, benchmarks, and leaderboards at https://github.com/westlake-repl/NineRec.

