Large Language Models (LLMs) have demonstrated remarkable performance across various domains, motivating researchers to investigate their potential use in recommendation systems. However, directly applying LLMs to recommendation tasks has proven challenging due to the significant disparity between the data used for pre-training LLMs and the specific requirements of recommendation tasks. In this study, we introduce Direct Multi-Preference Optimization (DMPO), a streamlined framework designed to bridge this gap and enhance the alignment of LLMs for recommendation tasks. DMPO improves the performance of LLM-based recommenders by simultaneously maximizing the probability of positive samples and minimizing the probability of multiple negative samples. We evaluated DMPO against both traditional recommendation methods and other LLM-based recommendation approaches. The results demonstrate that DMPO significantly improves the recommendation capabilities of LLMs across three real-world public datasets in few-shot scenarios. The experiments also indicate that DMPO exhibits superior generalization ability in cross-domain recommendation. A case study elucidates the reasons behind these consistent improvements and underscores DMPO's potential as an explainable recommendation system.
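The abstract does not state the exact objective, but the description — maximizing the probability of one positive sample while minimizing the probabilities of multiple negatives — suggests a DPO-style loss extended from a single negative to K negatives. A minimal sketch under that assumption (the function name `dmpo_loss`, the averaging over negatives, and the `beta` temperature are illustrative choices, not the paper's confirmed formulation):

```python
import torch
import torch.nn.functional as F

def dmpo_loss(pos_logp, pos_ref_logp, neg_logps, neg_ref_logps, beta=0.1):
    """Hypothetical multi-negative DPO-style loss.

    pos_logp / pos_ref_logp: scalar log-probs of the positive item's
        response under the policy and the frozen reference model.
    neg_logps / neg_ref_logps: (K,) log-probs of K negative items.
    """
    # Implicit rewards: log-ratio of policy to reference, scaled by beta,
    # as in standard DPO.
    pos_reward = beta * (pos_logp - pos_ref_logp)       # scalar
    neg_rewards = beta * (neg_logps - neg_ref_logps)    # (K,)
    # One pairwise Bradley-Terry term per negative, averaged over the K
    # negatives: pushes the positive above every negative simultaneously.
    return -F.logsigmoid(pos_reward - neg_rewards).mean()

# Toy usage with made-up log-probabilities for one positive and two negatives.
loss = dmpo_loss(
    pos_logp=torch.tensor(-1.0), pos_ref_logp=torch.tensor(-1.5),
    neg_logps=torch.tensor([-3.0, -4.0]),
    neg_ref_logps=torch.tensor([-2.5, -3.5]),
)
```

Averaging the pairwise terms (rather than summing) keeps the loss scale independent of the number of sampled negatives, which makes `beta` comparable across K.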