A Large Language Model Enhanced Conversational Recommender System

Conversational recommender systems (CRSs) aim to recommend high-quality items to users through a dialogue interface. A CRS usually involves multiple sub-tasks, such as user preference elicitation, recommendation, explanation, and item information search. Developing an effective CRS poses three challenges: 1) how to properly manage sub-tasks; 2) how to effectively solve the different sub-tasks; and 3) how to correctly generate responses that interact with users. Recently, Large Language Models (LLMs) have exhibited an unprecedented ability to reason and generate, presenting a new opportunity to develop more powerful CRSs. In this work, we propose a new LLM-based CRS, referred to as LLMCRS, to address the above challenges. For sub-task management, we leverage the reasoning ability of the LLM to manage sub-tasks effectively. For sub-task solving, the LLM collaborates with expert models for the different sub-tasks to achieve enhanced performance. For response generation, we use the generation ability of the LLM as a language interface to better interact with users. Specifically, LLMCRS divides the workflow into four stages: sub-task detection, model matching, sub-task execution, and response generation. LLMCRS also uses schema-based instruction, demonstration-based instruction, dynamic sub-task and model matching, and summary-based generation to instruct the LLM to generate the desired results in the workflow. Finally, to adapt the LLM to conversational recommendation, we propose fine-tuning it with reinforcement learning from CRS performance feedback, referred to as RLPF. Experimental results on benchmark datasets show that LLMCRS with RLPF outperforms existing methods.
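
The four-stage workflow named in the abstract can be illustrated with a toy sketch. This is not the LLMCRS implementation: the keyword rules stand in for the LLM's sub-task detection, the "expert models" are placeholder functions, and every name here (`detect_sub_task`, `EXPERTS`, `match_model`, `execute`, `generate_response`, `respond`) is hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

# Sub-task names taken from the abstract; the identifiers are assumptions.
SUB_TASKS = ("preference_elicitation", "recommendation", "explanation", "item_search")


def detect_sub_task(utterance: str) -> str:
    """Stage 1: sub-task detection.

    Stand-in only: a real system would prompt an LLM here; simple keyword
    rules are used so that the sketch runs without any model.
    """
    text = utterance.lower()
    if "why" in text:
        return "explanation"
    if "recommend" in text or "suggest" in text:
        return "recommendation"
    if "who" in text or "when" in text:
        return "item_search"
    return "preference_elicitation"


def _make_expert(name: str) -> Callable[[str], str]:
    """Build a placeholder expert model that only echoes its own name."""
    def expert(utterance: str) -> str:
        return f"placeholder output from the {name} expert"
    return expert


# Hypothetical registry mapping each sub-task to one expert model.
EXPERTS: Dict[str, Callable[[str], str]] = {name: _make_expert(name) for name in SUB_TASKS}


def match_model(sub_task: str) -> Callable[[str], str]:
    """Stage 2: model matching, reduced here to a dictionary lookup."""
    return EXPERTS[sub_task]


def execute(model: Callable[[str], str], utterance: str) -> str:
    """Stage 3: sub-task execution by the matched expert model."""
    return model(utterance)


def generate_response(sub_task: str, result: str) -> str:
    """Stage 4: response generation; a template replaces the LLM's summary."""
    return f"[{sub_task}] {result}"


def respond(utterance: str) -> str:
    """Run the four stages in order for a single user utterance."""
    sub_task = detect_sub_task(utterance)
    model = match_model(sub_task)
    result = execute(model, utterance)
    return generate_response(sub_task, result)
```

Calling `respond("Can you recommend a comedy?")` routes the utterance through the recommendation branch. In a fuller version, each stage would presumably be driven by an LLM prompt rather than by fixed rules and templates.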
