Conversational recommendation systems (CRS) generate recommendations from the contextual information in a conversation, but they often struggle because they lack collaborative filtering (CF) signals, which capture the user-item interaction patterns essential for accurate recommendations. We introduce Reddit-ML32M, a dataset that links Reddit conversations with interactions from MovieLens 32M, to enrich item representations with collaborative knowledge and to address interaction sparsity in conversational datasets. We propose an LLM-based framework that uses Reddit-ML32M to align LLM-generated recommendations with CF embeddings, refining the rankings for better performance. We evaluate the framework against three sets of baselines: CF-based recommenders that use only the interactions from CRS tasks, traditional CRS models, and LLM-based methods that rely on conversational context without item representations. Our approach achieves consistent improvements, including a 12.32% increase in Hit Rate and a 9.9% improvement in NDCG over the best-performing baseline, which relies on conversational context but lacks collaborative item representations.
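The abstract reports gains in Hit Rate and NDCG without defining either metric. For reference, the following is a minimal sketch of the commonly used binary-relevance forms of both metrics at a cutoff k. The function names, the cutoff values, and the toy item IDs are illustrative assumptions and do not come from the paper, whose exact evaluation protocol may differ.

```python
import math


def hit_rate_at_k(ranked_items, relevant_items, k):
    """Return 1.0 if any relevant item appears in the top-k list, else 0.0."""
    top_k = ranked_items[:k]
    return 1.0 if any(item in relevant_items for item in top_k) else 0.0


def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG@k.

    Assumes ranked_items contains no duplicates. Each relevant item found at
    0-based rank r contributes 1 / log2(r + 2) to the DCG; the ideal DCG places
    all relevant items (up to k of them) at the top of the list.
    """
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example (hypothetical item IDs): one ground-truth movie, ranked second.
ranked = ["movie_a", "movie_b", "movie_c"]
relevant = {"movie_b"}

hr_at_1 = hit_rate_at_k(ranked, relevant, k=1)
hr_at_2 = hit_rate_at_k(ranked, relevant, k=2)
ndcg_at_3 = ndcg_at_k(ranked, relevant, k=3)
```

In practice both values would be averaged over all test conversations. Hit Rate only records whether a ground-truth item reaches the top k, while NDCG also rewards placing it closer to the top, which is why a re-ranking step can move the two metrics by different amounts.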