On the Memorization Behavior of LLMs in Generative Recommendation: Observations, Implications, and Training Strategies

Generative recommendation (GR) has emerged as a promising direction for recommender systems. Recently, large language models (LLMs) have been increasingly adopted for GR, as their rich pretrained knowledge is expected to help them generalize beyond the common user behavior patterns that traditional memorization-oriented baselines can capture. However, existing LLM-based GR works largely ignore LLMs' well-known tendency to memorize, which, if present in LLMs fine-tuned for GR, would restrict their use of pretrained knowledge. In this work, we investigate this concern by examining one-hop memorization, where a model recommends items that are direct successors of items in the training data. We show that LLMs do this more than non-LLM-based GR models; in fact, the vast majority of their gains over GR baselines come from users whose target items can be predicted through one-hop memorization. We intuit that improving performance on the remaining users requires LLMs to learn richer item-item relations beyond one-hop transitions. To achieve this, we propose IIRG, a novel training strategy that teaches LLMs to capture: (1) collaborative relations derived from item co-occurrences across multiple hops in user sequences, and (2) semantic relations among items with similar themes, both of which can serve as useful recommendation signals. We show that IIRG significantly improves over LLMs trained solely with standard next-item prediction, with especially large gains for users whose test items are not covered by train-time one-hop transitions.
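
To make the notion of one-hop coverage concrete, the following is a minimal sketch, not the authors' implementation. It assumes that training data is a list of per-user item sequences, and that a test target counts as "covered" when it directly follows some item of the user's history in at least one training sequence. The abstract does not state the exact matching rule, so the function names and the `last_only` switch (which restricts the check to the most recent history item) are hypothetical.

```python
from collections import defaultdict


def build_one_hop_transitions(train_sequences):
    """Map each item to the set of items that directly follow it in training."""
    successors = defaultdict(set)
    for seq in train_sequences:
        # Pair every item with its immediate successor in the same sequence.
        for prev_item, next_item in zip(seq, seq[1:]):
            successors[prev_item].add(next_item)
    return successors


def is_one_hop_covered(history, target, successors, last_only=False):
    """Return True if `target` is a train-time direct successor of a history item.

    Assumption: with last_only=False any history item may act as the source
    of the transition; with last_only=True only the most recent item may.
    """
    if not history:
        return False
    sources = history[-1:] if last_only else history
    # .get avoids adding empty entries to the defaultdict during lookup.
    return any(target in successors.get(item, ()) for item in sources)


# Toy data: item IDs are arbitrary strings chosen for illustration.
train = [["a", "b", "c"], ["b", "d"], ["e", "f"]]
succ = build_one_hop_transitions(train)

covered = is_one_hop_covered(["a", "b"], "d", succ)        # "b" -> "d" seen in training
uncovered = is_one_hop_covered(["a", "b"], "f", succ)      # no history item precedes "f"
strict = is_one_hop_covered(["b", "a"], "d", succ, last_only=True)  # only "a" is checked
```

Splitting test users by this flag gives the two groups the abstract contrasts: users whose targets are reachable through one-hop transitions and the remaining users, for whom richer item-item relations would be needed.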
