The advent of large language models (LLMs) has ushered in a new paradigm of search engines that use generative models to gather and summarize information to answer user queries. This emerging technology, which we formalize under the unified framework of generative engines (GEs), can generate accurate and personalized responses and is rapidly replacing traditional search engines like Google and Bing. Generative engines typically satisfy queries by retrieving information from multiple sources and summarizing it with LLMs. While this shift significantly improves $\textit{user}$ utility and $\textit{generative search engine}$ traffic, it poses a huge challenge for the third stakeholder: website and content creators. Given the black-box and fast-moving nature of generative engines, content creators have little to no control over $\textit{when}$ and $\textit{how}$ their content is displayed. With generative engines here to stay, we must ensure the creator economy is not disadvantaged. To address this, we introduce Generative Engine Optimization (GEO), the first paradigm to aid content creators in improving the visibility of their content in generative engine responses, through a flexible black-box optimization framework for defining and optimizing visibility metrics. We facilitate systematic evaluation by introducing GEO-bench, a large-scale benchmark of diverse user queries across multiple domains, along with relevant web sources to answer these queries. Through rigorous evaluation, we demonstrate that GEO can boost visibility in generative engine responses by up to $40\%$. Moreover, we show that the efficacy of these strategies varies across domains, underscoring the need for domain-specific optimization methods. Our work opens a new frontier in information discovery systems, with profound implications for both developers of generative engines and content creators.