Leveraging the vast open-world knowledge and understanding capabilities of Large Language Models (LLMs) to develop general-purpose, semantically aware recommender systems has emerged as a pivotal research direction in generative recommendation. However, existing methods face bottlenecks in constructing item identifiers. Text-based methods expose the LLM's vast output space, which leads to hallucination, while methods based on Semantic IDs (SIDs) encounter a semantic gap between SIDs and the LLM's native vocabulary, requiring costly vocabulary expansion and alignment training. To address this, this paper introduces Term IDs (TIDs), defined as a set of semantically rich and standardized textual keywords, to serve as robust item identifiers. We propose GRLM, a novel framework centered on TIDs, which employs Context-aware Term Generation to convert item metadata into standardized TIDs and utilizes Integrative Instruction Fine-tuning to jointly optimize term internalization and sequential recommendation. Additionally, Elastic Identifier Grounding is designed for robust item mapping. Extensive experiments on real-world datasets demonstrate that GRLM significantly outperforms baselines across multiple scenarios, pointing to a promising direction for generalizable and high-performance generative recommendation systems.