Recommender systems (RS) have achieved significant success by leveraging explicit identification (ID) features. However, the potential of content features, especially raw image pixel features, remains relatively unexplored. The scarcity of large, diverse, content-driven image recommendation datasets has hindered the use of raw images as item representations. To address this, we present PixelRec, a massive image-centric recommendation dataset that includes approximately 200 million user-image interactions, 30 million users, and 400,000 high-quality cover images. By providing direct access to raw image pixels, PixelRec enables recommendation models to learn item representations directly from those pixels. To demonstrate its utility, we first present the results of several classical pure ID-based baseline models, termed IDNet, trained on PixelRec. Then, to show the effectiveness of the dataset's image features, we replace the itemID embeddings in IDNet with a powerful vision encoder that represents items by their raw image pixels. We call this new model PixelNet. Our findings indicate that even in standard, non-cold-start recommendation settings, where IDNet is recognized as highly effective, PixelNet can already match or even outperform IDNet. Moreover, PixelNet has several other notable advantages over IDNet, such as being more effective in cold-start and cross-domain recommendation scenarios. These results underscore the importance of the visual features in PixelRec. We believe that PixelRec can serve as a critical resource and testing ground for research on recommendation models that emphasize image pixel content. The dataset, code, and leaderboard will be available at https://github.com/westlake-repl/PixelRec.
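The difference between IDNet and PixelNet described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the 4-dimensional embedding size, the mean-pooled user vector, and the single linear map standing in for a vision encoder are all assumptions made for illustration. It shows only the structural point that an ID lookup table has no entry for an unseen item, whereas a pixel-based encoder can still produce a representation for it.

```python
import random

random.seed(0)
DIM = 4  # embedding size (hypothetical, chosen for illustration)


class IDItemEncoder:
    """IDNet-style item encoder: one vector per known item ID."""

    def __init__(self, num_items, dim=DIM):
        self.table = [[random.gauss(0, 0.1) for _ in range(dim)]
                      for _ in range(num_items)]

    def encode(self, item_id, pixels=None):
        # Pixels are ignored; an unseen item_id raises IndexError.
        return self.table[item_id]


class PixelItemEncoder:
    """PixelNet-style stand-in: maps raw pixels to a vector.

    A single linear map replaces the vision encoder (assumption).
    """

    def __init__(self, num_pixels, dim=DIM):
        self.weights = [[random.gauss(0, 0.1) for _ in range(num_pixels)]
                        for _ in range(dim)]

    def encode(self, item_id, pixels):
        # The item ID is ignored; only the image content is used.
        return [sum(w * p for w, p in zip(row, pixels))
                for row in self.weights]


def user_vector(encoder, history, images):
    """Mean-pool the vectors of the items a user interacted with."""
    vecs = [encoder.encode(i, images[i]) for i in history]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def score(user_vec, item_vec):
    """Dot-product relevance score between a user and an item."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


# Three toy 2x2 grayscale images, flattened; item 2 is a new (cold) item.
images = [
    [0.0, 0.5, 1.0, 0.5],
    [1.0, 1.0, 0.0, 0.0],
    [0.2, 0.8, 0.2, 0.8],
]
history = [0, 1]  # items the user has interacted with

id_enc = IDItemEncoder(num_items=2)      # knows only items 0 and 1
pixel_enc = PixelItemEncoder(num_pixels=4)

pixel_user = user_vector(pixel_enc, history, images)
cold_item_score = score(pixel_user, pixel_enc.encode(2, images[2]))

try:
    id_enc.encode(2)
    id_handles_cold_item = True
except IndexError:
    id_handles_cold_item = False
```

In this sketch the rest of the recommender (user pooling and scoring) is identical for both encoders, which mirrors the abstract's description of swapping only the item representation. The untrained random weights mean the scores carry no meaning; the sketch says nothing about accuracy.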