As interest in Large Language Models (LLMs) for recommendation grows, optimizing LLMs for recommendation (referred to as LLM4Rec) plays a crucial role in improving the quality of their recommendations. However, existing LLM4Rec approaches often assess performance on restricted candidate sets, which may not accurately reflect the models' overall ranking capabilities. In this paper, we investigate the comprehensive ranking capacity of LLMs and propose a two-step grounding framework, BIGRec (Bi-step Grounding Paradigm for Recommendation). BIGRec first grounds LLMs to the recommendation space by fine-tuning them to generate meaningful tokens for items, and then identifies the actual items that correspond to the generated tokens. Through extensive experiments on two datasets, we demonstrate BIGRec's superior performance, its ability to handle few-shot scenarios, and its versatility across multiple domains. Furthermore, we observe that increasing the number of training samples yields only modest marginal gains for BIGRec, which implies that LLMs, because of their strong semantic priors, have limited capability to assimilate statistical information such as popularity and collaborative filtering signals. These findings also underline the efficacy of integrating diverse statistical information into the LLM4Rec framework, pointing to a potential avenue for future research. Our code and data are available at https://github.com/SAI990323/Grounding4Rec.
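The second grounding step described in the abstract (mapping generated tokens to actual items) can be illustrated with a minimal sketch. The abstract does not say how the match is made, so everything below is an assumption for illustration: the text generated by the fine-tuned LLM is compared against every item title in the catalog by L2 distance between embeddings, and the items are ranked by that distance. The `embed` function is a toy character-trigram stand-in for a real LLM embedding, and the names `embed`, `l2_distance`, `ground_to_items`, and the example catalog are hypothetical. Step one (fine-tuning the LLM) is not shown.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): counts of lowercase character trigrams.

    A real system would use a learned representation; this stand-in only
    keeps the example self-contained.
    """
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))


def l2_distance(a, b):
    """Euclidean distance between two sparse count vectors."""
    keys = set(a) | set(b)
    # Counter returns 0 for missing keys, so absent trigrams count as zeros.
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


def ground_to_items(generated, catalog, k=3):
    """Rank ALL catalog items by distance to the generated text; return top k.

    Ranking the whole catalog, rather than a small candidate set, mirrors
    the abstract's focus on comprehensive ranking.
    """
    g = embed(generated)
    ranked = sorted(catalog, key=lambda title: l2_distance(g, embed(title)))
    return ranked[:k]


# Hypothetical catalog and hypothetical output of a fine-tuned LLM.
catalog = ["The Matrix", "The Matrix Reloaded", "Toy Story", "Finding Nemo"]
generated = "the matrix"

print(ground_to_items(generated, catalog, k=2))
# -> ['The Matrix', 'The Matrix Reloaded']
```

Because the generated text need not match any catalog title exactly, the nearest-neighbor lookup always returns real items, which is the purpose of the second step. The choice of distance and embedding here is illustrative only.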