Rethinking Large Language Model Architectures for Sequential Recommendations

Recently, sequential recommendation has been adapted to the LLM paradigm to enjoy the power of LLMs. LLM-based methods usually formulate recommendation information into natural language and the model is trained to predict the next item in an auto-regressive manner. Despite their notable success, the substantial computational overhead of inference poses a significant obstacle to their real-world applicability. In this work, we endeavor to streamline existing LLM-based recommendation models and propose a simple yet highly effective model Lite-LLM4Rec. The primary goal of Lite-LLM4Rec is to achieve efficient inference for the sequential recommendation task. Lite-LLM4Rec circumvents the beam search decoding by using a straight item projection head for ranking scores generation. This design stems from our empirical observation that beam search decoding is ultimately unnecessary for sequential recommendations. Additionally, Lite-LLM4Rec introduces a hierarchical LLM structure tailored to efficiently handle the extensive contextual information associated with items, thereby reducing computational overhead while enjoying the capabilities of LLMs. Experiments on three publicly available datasets corroborate the effectiveness of Lite-LLM4Rec in both performance and inference efficiency (notably 46.8% performance improvement and 97.28% efficiency improvement on ML-1m) over existing LLM-based methods. Our implementations will be open sourced.

翻译：近期，序列推荐领域已借助大语言模型范式来发挥其强大能力。基于LLM的方法通常将推荐信息转化为自然语言形式，并通过自回归方式预测下一物品。尽管这些方法取得了显著成功，但推理过程中巨大的计算开销严重阻碍了其实际应用。本研究致力于精简现有基于LLM的推荐模型，并提出一种简单而高效的模型Lite-LLM4Rec。Lite-LLM4Rec的核心目标是实现序列推荐任务的高效推理。该模型通过使用直接物品投影头生成排序分数，从而避免了束搜索解码过程——这一设计源于我们的实验观察：束搜索解码对于序列推荐任务本质上并非必要。此外，Lite-LLM4Rec引入了分层LLM架构，专门用于高效处理与物品相关的广泛上下文信息，在保持LLM能力的同时显著降低计算开销。在三组公开数据集上的实验证实，相较于现有基于LLM的方法，Lite-LLM4Rec在性能（在ML-1m数据集上提升46.8%）和推理效率（提升97.28%）方面均表现出显著优势。我们的实现将进行开源。

相关内容

大语言模型

关注 67

大语言模型是基于海量文本数据训练的深度学习模型。它不仅能够生成自然语言文本，还能够深入理解文本含义，处理各种自然语言任务，如文本摘要、问答、翻译等。2023年，大语言模型及其在人工智能领域的应用已成为全球科技研究的热点，其在规模上的增长尤为引人注目，参数量已从最初的十几亿跃升到如今的一万亿。参数量的提升使得模型能够更加精细地捕捉人类语言微妙之处，更加深入地理解人类语言的复杂性。在过去的一年里，大语言模型在吸纳新知识、分解复杂任务以及图文对齐等多方面都有显著提升。随着技术的不断成熟，它将不断拓展其应用范围，为人类提供更加智能化和个性化的服务，进一步改善人们的生活和生产方式。

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日