C-SEO Bench: Does Conversational SEO Work?

Large Language Models (LLMs) are transforming search engines into Conversational Search Engines (CSE). Consequently, Search Engine Optimization (SEO) is shifting toward Conversational Search Engine Optimization (C-SEO). Dedicated C-SEO methods are beginning to appear; they modify web documents to increase their visibility in CSE responses. However, these methods are often tested on only a narrow set of application domains, so we do not know whether a given C-SEO method is effective across a broad range of domains. Moreover, existing evaluations consider only a single-actor scenario in which one web document adopts a C-SEO method; in reality, multiple players are likely to adopt cutting-edge C-SEO techniques competitively, as the dynamics of traditional SEO suggest. We present C-SEO Bench, the first benchmark designed to evaluate C-SEO methods across multiple tasks, domains, and numbers of actors. We consider two search tasks, question answering and product recommendation, with three domains each. We also formalize a new evaluation protocol that varies the adoption rate among the actors involved. Our experiments reveal that most current C-SEO methods are not only largely ineffective but also frequently have a negative impact on document ranking, the opposite of their intended effect. In contrast, traditional SEO strategies, which aim to improve the ranking of the source in the LLM context, are significantly more effective. We also observe that the overall gains decrease as the number of C-SEO adopters increases, illustrating the congested, zero-sum nature of the problem. Our code and data are available at https://github.com/parameterlab/c-seo-bench and https://huggingface.co/datasets/parameterlab/c-seo-bench.
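
To make the idea of evaluating under varying adoption rates concrete, the following is a minimal sketch and not the benchmark's actual implementation. It assumes a ranking is a list of document ids ordered from best to worst, and it measures the gain of a document as the number of positions it moves up after some documents adopt a C-SEO method. All function names (`position`, `rank_gain`, `pick_adopters`, `mean_adopter_gain`) and the example rankings are hypothetical and chosen only for illustration.

```python
import random
from typing import Sequence, Set


def position(ranking: Sequence[str], doc_id: str) -> int:
    """Return the 1-based position of doc_id in a ranking (best first)."""
    return ranking.index(doc_id) + 1


def rank_gain(before: Sequence[str], after: Sequence[str], doc_id: str) -> int:
    """Positions gained by doc_id; positive means it moved up."""
    return position(before, doc_id) - position(after, doc_id)


def pick_adopters(doc_ids: Sequence[str], adoption_rate: float, seed: int = 0) -> Set[str]:
    """Sample the documents that adopt the method (assumed uniform sampling)."""
    rng = random.Random(seed)
    k = round(adoption_rate * len(doc_ids))
    return set(rng.sample(list(doc_ids), k))


def mean_adopter_gain(before: Sequence[str], after: Sequence[str], adopters: Set[str]) -> float:
    """Average rank gain over the adopting documents (0.0 if nobody adopts)."""
    if not adopters:
        return 0.0
    return sum(rank_gain(before, after, d) for d in adopters) / len(adopters)


# Hypothetical rankings, invented for illustration only.
before = ["a", "b", "c", "d"]
after_single = ["a", "d", "b", "c"]  # only "d" adopted and moved up
after_all = ["d", "c", "b", "a"]     # every document adopted

single_gain = mean_adopter_gain(before, after_single, {"d"})
all_gain = mean_adopter_gain(before, after_all, set(before))
```

In this toy metric, the gains of all documents in a ranking always sum to zero, because the set of positions is fixed. When every document adopts, the mean adopter gain is therefore zero regardless of how the ranking changes, which is one simple way to picture the zero-sum behavior described in the abstract.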
