In recent years, the integration of Large Language Models (LLMs) into recommender systems has attracted interest from both practitioners and researchers. The field is still emerging, however, and the lack of open-source R&D platforms may impede the exploration of LLM-based recommendation. This paper introduces OpenP5, an open-source platform that supports the development, training, and evaluation of LLM-based generative recommender systems for research purposes. The platform is implemented with both encoder-decoder LLMs (e.g., T5) and decoder-only LLMs (e.g., Llama-2) across 10 widely used public datasets, and it covers two fundamental recommendation tasks: sequential recommendation and straightforward recommendation. Because item IDs play a crucial role in LLM-based recommendation, OpenP5 also incorporates three item indexing methods: random indexing, sequential indexing, and collaborative indexing. Built on the Transformers library, the platform makes it easy for users to customize LLM-based recommendation. OpenP5 provides extensible data processing, task-centric optimization, comprehensive datasets and checkpoints, efficient acceleration, and standardized evaluation, making it a practical tool for implementing and evaluating LLM-based recommender systems. The open-source code and pre-trained checkpoints for the OpenP5 library are publicly available at https://github.com/agiresearch/OpenP5.
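To make the item indexing methods named in the abstract more concrete, the following is a minimal Python sketch of two of them, random indexing and sequential indexing. It is an illustration under stated assumptions, not OpenP5's actual code: the function names `random_index` and `sequential_index`, the `start` and `seed` parameters, and the toy interaction data are all hypothetical. Sequential indexing is assumed here to mean assigning consecutive integers to items in order of first appearance while scanning users' interaction histories one user at a time. Collaborative indexing, which would additionally need item co-occurrence information, is left out.

```python
import random
from typing import Dict, Iterable, List


def sequential_index(user_sequences: Dict[str, List[str]], start: int = 1) -> Dict[str, str]:
    """Assign consecutive IDs to items in order of first appearance.

    Assumption: users are scanned in dict insertion order, and each user's
    items are scanned in interaction order.
    """
    mapping: Dict[str, str] = {}
    next_id = start
    for items in user_sequences.values():
        for item in items:
            if item not in mapping:
                mapping[item] = str(next_id)
                next_id += 1
    return mapping


def random_index(items: Iterable[str], seed: int = 0, start: int = 1) -> Dict[str, str]:
    """Assign each distinct item a unique ID drawn from a shuffled range."""
    unique_items = sorted(set(items))  # sort so the result depends only on the seed
    ids = list(range(start, start + len(unique_items)))
    random.Random(seed).shuffle(ids)
    return {item: str(i) for item, i in zip(unique_items, ids)}


# Toy interaction histories (hypothetical data).
histories = {
    "user_a": ["shampoo", "soap", "towel"],
    "user_b": ["soap", "razor"],
}

seq_ids = sequential_index(histories)
rand_ids = random_index(item for seq in histories.values() for item in seq)
```

Under sequential indexing, items that are consumed close together tend to receive nearby numbers, whereas random indexing assigns IDs that carry no such relationship. The mapped ID strings would then stand in for the raw item names in the text fed to the LLM.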