Large Language Models (LLMs) have demonstrated strong potential for generative recommendation by leveraging rich semantic knowledge. However, existing LLM-based recommender systems struggle to effectively incorporate collaborative filtering (CF) signals, owing to a fundamental mismatch between item-level preference modeling in CF and token-level next-token prediction (NTP) optimization in LLMs. Prior approaches typically treat CF as contextual hints or representation bias, and resort to multi-stage training to reduce the discrepancy between behavioral and semantic spaces, leaving CF unable to explicitly regulate LLM generation. In this work, we propose Token-level Collaborative Alignment for Recommendation (TCA4Rec), a model-agnostic and plug-and-play framework that establishes an explicit optimization-level interface between CF supervision and LLM generation. TCA4Rec consists of (i) a Collaborative Tokenizer, which projects raw item-level CF logits into token-level distributions aligned with the LLM token space, and (ii) Soft Label Alignment, which integrates these CF-informed distributions with one-hot supervision to optimize a soft NTP objective. This design preserves the generative nature of LLM training while enabling collaborative alignment with the essential user preferences captured by CF models. We highlight that TCA4Rec is compatible with arbitrary traditional CF models and generalizes across a wide range of decoder-based LLM recommender architectures. Moreover, it provides an explicit mechanism to balance behavioral alignment and semantic fluency, yielding generative recommendations that are both accurate and controllable. Extensive experiments demonstrate that TCA4Rec consistently improves recommendation performance across a broad spectrum of CF models and LLM-based recommender systems.
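The two components described above can be sketched as follows. This is a minimal illustrative implementation, not the paper's actual code: the names `collaborative_tokenize`, `soft_ntp_loss`, the row-stochastic `item_token_map` construction, and the mixing weight `alpha` are all assumptions introduced here to make the abstract's soft-NTP idea concrete.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def collaborative_tokenize(cf_item_logits, item_token_map):
    """Collaborative Tokenizer sketch: project item-level CF logits
    into a token-level distribution over the LLM vocabulary.

    cf_item_logits: (batch, n_items) raw scores from any CF model.
    item_token_map: (n_items, vocab) row-stochastic matrix linking each
        item to its identifier tokens (hypothetical construction).
    """
    item_probs = softmax(cf_item_logits)      # item-level preference
    return item_probs @ item_token_map        # (batch, vocab) token dist

def soft_ntp_loss(llm_logits, cf_token_dist, target_ids, alpha=0.7):
    """Soft Label Alignment sketch: blend one-hot NTP targets with the
    CF-informed token distribution, then take cross-entropy.

    alpha balances semantic fluency (one-hot supervision) against
    behavioral alignment (CF soft labels); alpha=1 recovers standard NTP.
    """
    batch, vocab = llm_logits.shape
    one_hot = np.eye(vocab)[target_ids]                    # (batch, vocab)
    soft_target = alpha * one_hot + (1 - alpha) * cf_token_dist
    log_probs = np.log(softmax(llm_logits))
    return -(soft_target * log_probs).sum(axis=-1).mean()
```

Because both terms of the soft target are valid probability distributions, their convex combination is too, so the objective remains a proper cross-entropy; setting `alpha=1` falls back to the plain one-hot NTP loss, which is what makes the behavioral/semantic trade-off explicitly tunable.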