Large language models (LLMs) are increasingly applied to recommendation, retrieval, and reasoning, yet deploying a single end-to-end model that can jointly support these behaviors over large, heterogeneous catalogs remains challenging. Such systems must generate unambiguous references to real items, handle multiple entity types, and operate under strict latency and reliability constraints, requirements that are difficult to satisfy with text-only generation. While tool-augmented recommender systems address parts of this problem, they introduce orchestration complexity and limit end-to-end optimization. We view this setting as an instance of a broader research problem: how to adapt LLMs to reason jointly over multi-domain entities, users, and language in a fully self-contained manner. To this end, we introduce NEO, a framework that adapts a pre-trained decoder-only LLM into a tool-free, catalog-grounded generator. NEO represents items as SIDs and trains a single model to interleave natural language and typed item identifiers within a shared sequence. Text prompts control the task, target entity type, and output format (IDs, text, or mixed), while constrained decoding guarantees catalog-valid item generation without restricting free-form text. We refer to this instruction-conditioned controllability as language-steerability. We treat SIDs as a distinct modality and study design choices for integrating discrete entity representations into LLMs via staged alignment and instruction tuning. We evaluate NEO at scale on a real-world catalog of over 10M items across multiple media types and discovery tasks, including recommendation, search, and user understanding. In offline experiments, NEO consistently outperforms strong task-specific baselines and exhibits cross-task transfer, demonstrating a practical path toward consolidating large-scale discovery capabilities into a single language-steerable generative model.
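To make the constrained-decoding idea concrete, the following is a minimal sketch and not NEO's actual implementation. It assumes a hypothetical token layout in which a special `sid_start` token opens an identifier, every identifier spans exactly `sid_len` tokens, and the catalog is stored in a prefix trie. The names `SIDTrie`, `allowed_tokens`, `sid_start`, and `sid_len` are illustrative assumptions. Inside an identifier only trie-valid continuations are allowed; outside it, free-form text remains unrestricted.

```python
from typing import Dict, Optional, Sequence, Set, Tuple


class SIDTrie:
    """Prefix trie over the token sequences of valid catalog identifiers."""

    def __init__(self, catalog: Sequence[Tuple[int, ...]]) -> None:
        self.root: Dict[int, dict] = {}
        for sid in catalog:
            node = self.root
            for tok in sid:
                node = node.setdefault(tok, {})

    def allowed_next(self, prefix: Sequence[int]) -> Set[int]:
        """Return the tokens that extend `prefix` toward a catalog item."""
        node: Optional[dict] = self.root
        for tok in prefix:
            node = node.get(tok) if node is not None else None
            if node is None:
                return set()
        return set(node.keys())


def allowed_tokens(
    generated: Sequence[int],
    trie: SIDTrie,
    text_vocab: Set[int],
    sid_start: int,
    sid_len: int,
) -> Set[int]:
    """Token ids permitted at the next decoding step.

    Assumes `generated` was itself produced under this constraint, so the
    first `sid_len` tokens after each `sid_start` are identifier tokens.
    """
    # Locate the most recent identifier-opening marker, if any.
    last = None
    for i in range(len(generated) - 1, -1, -1):
        if generated[i] == sid_start:
            last = i
            break
    if last is not None:
        prefix = list(generated[last + 1:])
        if len(prefix) < sid_len:
            # Still inside an identifier: only catalog-valid continuations.
            return trie.allowed_next(prefix)
    # Outside an identifier: any text token, or the start of a new identifier.
    return text_vocab | {sid_start}
```

In a real decoder, the returned set would be used to mask the logits of all other tokens before sampling, so an identifier can only ever resolve to an item present in the catalog.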