Dense embedding-based text retrieval (retrieval of relevant passages from corpora via deep learning encodings) has emerged as a powerful method, attaining state-of-the-art search results and popularizing the use of Retrieval-Augmented Generation (RAG). Still, like other search methods, embedding-based retrieval may be susceptible to search-engine optimization (SEO) attacks, where adversaries promote malicious content by introducing adversarial passages into corpora. To faithfully assess and gain insights into the susceptibility of such systems to SEO, this work proposes the GASLITE attack, a mathematically principled gradient-based search method for generating adversarial passages without relying on the corpus content or modifying the model. Notably, GASLITE's passages (1) carry adversary-chosen information while (2) achieving high retrieval ranking for a selected query distribution when inserted into corpora. We use GASLITE to extensively evaluate retrievers' robustness, testing nine advanced models under varied threat models, while focusing on realistic adversaries targeting queries on a specific concept (e.g., a public figure). We find that GASLITE consistently outperforms baselines by $\geq$140% in success rate across all settings. In particular, adversaries using GASLITE require minimal effort to manipulate search results: by injecting a negligible amount of adversarial passages ($\leq$0.0001% of the corpus), they could make these passages appear in the top-10 results for 61-100% of unseen concept-specific queries against most evaluated models. Inspecting the variance in retrievers' robustness, we identify key factors that may contribute to models' susceptibility to SEO, including specific properties of the embedding space's geometry.
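To make the headline measurement concrete, the following is a minimal sketch of how top-10 visibility could be computed: the fraction of queries for which at least one adversarial passage ranks among the top-k retrieved results. It is an illustration under stated assumptions, not the evaluation code of this work. The function names (`cosine`, `appears_in_top_k`, `visibility_rate`), the use of cosine similarity as the retrieval score, and the toy 2-D vectors standing in for embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def appears_in_top_k(query, corpus, adversarial, k=10):
    """True if at least one adversarial passage ranks in the top-k for the query.

    Assumption: ranking is by cosine similarity; real retrievers may use
    dot product or another score instead.
    """
    best_adv = max(cosine(query, p) for p in adversarial)
    # Count benign passages that score strictly higher than the best adversarial one.
    better = sum(1 for p in corpus if cosine(query, p) > best_adv)
    return better < k


def visibility_rate(queries, corpus, adversarial, k=10):
    """Fraction of queries with an adversarial passage in the top-k results."""
    hits = sum(appears_in_top_k(q, corpus, adversarial, k) for q in queries)
    return hits / len(queries)


# Toy data (hypothetical): 2-D vectors standing in for passage/query embeddings.
corpus = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adversarial = [[0.7, 0.7]]
queries = [[1.0, 1.0], [1.0, 0.0]]

rate_at_1 = visibility_rate(queries, corpus, adversarial, k=1)
rate_at_3 = visibility_rate(queries, corpus, adversarial, k=3)
```

In this toy setup the single adversarial vector is the top result for the first query only, so widening the cutoff from k=1 to k=3 raises the measured visibility. In a realistic evaluation the queries would be held-out (unseen) concept-specific queries and the vectors would come from the retriever under test.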