The advent of large language models (LLMs) has ushered in a new paradigm of search engines that use generative models to gather and summarize information to answer user queries. This emerging technology, which we formalize under the unified framework of generative engines (GEs), can generate accurate and personalized responses and is rapidly replacing traditional search engines like Google and Bing. Generative engines typically satisfy queries by synthesizing information from multiple sources and summarizing it using LLMs. While this shift significantly improves \textit{user} utility and \textit{generative search engine} traffic, it poses a huge challenge for the third stakeholder: website and content creators. Given the black-box and fast-moving nature of generative engines, content creators have little to no control over \textit{when} and \textit{how} their content is displayed. With generative engines here to stay, we must ensure the creator economy is not disadvantaged. To address this, we introduce Generative Engine Optimization (GEO), the first paradigm to help content creators improve the visibility of their content in GE responses, through a flexible black-box optimization framework for defining and optimizing visibility metrics. We facilitate systematic evaluation by introducing GEO-bench, a large-scale benchmark of diverse user queries across multiple domains, along with relevant web sources to answer these queries. Through rigorous evaluation, we demonstrate that GEO can boost visibility in GE responses by up to 40\%. Moreover, we show that the efficacy of these strategies varies across domains, underscoring the need for domain-specific optimization methods. Our work opens a new frontier in information discovery systems, with profound implications for both developers of GEs and content creators.