The emergence of Large Language Model-enhanced Search Engines (LLMSEs) has revolutionized information retrieval by integrating web-scale search with AI-powered summarization. While these systems are more efficient than traditional search engines, their security against well-established black-hat Search Engine Optimization (SEO) attacks remains unexplored. In this paper, we present the first systematic study of SEO attacks targeting LLMSEs. Specifically, we examine ten representative LLMSE products (e.g., ChatGPT, Gemini) and construct SEO-Bench, a benchmark comprising 1,000 real-world black-hat SEO websites, to evaluate both open- and closed-source LLMSEs. Our measurements show that LLMSEs mitigate over 99.78% of traditional SEO attacks, with the retrieval phase serving as the primary filter and intercepting the vast majority of malicious queries. We further propose and evaluate seven LLMSEO attack strategies, demonstrating that off-the-shelf LLMSEs are vulnerable to them: in particular, rewritten-query stuffing and segmented texts double the manipulation rate relative to the baseline. This work offers the first in-depth security analysis of the LLMSE ecosystem and provides practical insights for building more resilient AI-driven search systems. We have responsibly reported the identified issues to major vendors.