Modern search engines are built on a stack of components, including query understanding, retrieval, multi-stage ranking, and question answering. These components are often optimized and deployed independently. In this paper, we introduce a novel conceptual framework called the large search model, which redefines the conventional search stack by unifying search tasks within a single large language model (LLM). All tasks are formulated as autoregressive text generation problems, so each task can be customized through natural language prompts. The framework capitalizes on the strong language understanding and reasoning capabilities of LLMs, offering the potential to improve search result quality while simplifying the existing, cumbersome search stack. To substantiate the feasibility of this framework, we present a series of proof-of-concept experiments and discuss the challenges of implementing this approach in real-world search systems.
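As a rough illustration of what formulating every task as text generation could look like, the sketch below routes three search tasks through one generation function and varies only the prompt. The template wording and the names `PROMPT_TEMPLATES`, `build_prompt`, `run_task`, and `echo_model` are hypothetical and not taken from the paper. The `generate` callable stands in for whatever LLM interface is actually used.

```python
from typing import Callable, Dict

# Hypothetical prompt templates, one per search task. The wording is
# illustrative only and is not taken from the paper.
PROMPT_TEMPLATES: Dict[str, str] = {
    "query_understanding": (
        "Rewrite the search query to make its intent explicit.\n"
        "Query: {query}\n"
        "Rewritten query:"
    ),
    "ranking": (
        "Query: {query}\n"
        "Document: {document}\n"
        "Is the document relevant to the query? Answer yes or no:"
    ),
    "question_answering": (
        "Answer the question using the document below.\n"
        "Question: {query}\n"
        "Document: {document}\n"
        "Answer:"
    ),
}


def build_prompt(task: str, query: str, document: str = "") -> str:
    """Fill the template for `task`; unused fields are simply ignored."""
    if task not in PROMPT_TEMPLATES:
        raise ValueError(f"unknown task: {task}")
    return PROMPT_TEMPLATES[task].format(query=query, document=document)


def run_task(
    generate: Callable[[str], str],
    task: str,
    query: str,
    document: str = "",
) -> str:
    """Run any task through the same text-in, text-out model call."""
    return generate(build_prompt(task, query, document))


def echo_model(prompt: str) -> str:
    """Stand-in for an LLM: returns the last line of the prompt."""
    return prompt.splitlines()[-1]


if __name__ == "__main__":
    # The same function serves every task; only the prompt changes.
    print(run_task(echo_model, "ranking", "large search model", "An LLM-based search stack."))
```

Under this assumed design, adding a new task means adding a template rather than training and deploying a separate component.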