Inspired by the exceptional general intelligence of Large Language Models (LLMs), researchers have begun to explore their application in pioneering the next generation of recommender systems: systems that are conversational, explainable, and controllable. However, the existing literature primarily concentrates on integrating domain-specific knowledge into LLMs to enhance accuracy, often neglecting their ability to follow instructions. To address this gap, we first introduce a collection of supervised learning tasks, augmented with labels derived from a conventional recommender model, that explicitly improve LLMs' proficiency in adhering to recommendation-specific instructions. We then develop a reinforcement learning-based alignment procedure to further strengthen LLMs' responsiveness to users' intentions and reduce formatting errors. Extensive experiments on two real-world datasets show that our method markedly advances LLMs' capability to comply with instructions within recommender systems while sustaining a high level of recommendation accuracy.