A Wolf in Sheep's Clothing: Targeted Routing Hijacking in Federated RAG

Federated Retrieval-Augmented Generation (FedRAG) is attractive for privacy-sensitive applications because raw data remain local. Because the router cannot inspect that data, it must rely on client-provided semantic profiles, which creates a new opportunity for manipulation. We introduce Routing Hijacking, a routing-stage attack in which a malicious client forges its profile to attract target queries even though its underlying data are irrelevant. We show that this vulnerability is severe: across three representative FedRAG routing architectures, Routing Hijacking consistently misroutes target queries and causes downstream failures, including missing evidence, poisoning, incorrect answers, and hallucinations. In a high-stakes MedQA-USMLE case study, we further show that poisoned retrieved evidence can mislead models across scales, leading to incorrect answers, hallucinations, and sycophantic failures. Existing defenses do not close this gap: encrypted routing preserves the exploited ranking, and Byzantine-robust Federated Learning (FL) rules transfer poorly to heterogeneous routing profiles. We therefore propose a trust-aware post-routing framework that reweights clients using returned-evidence feedback, including retrieval relevance, profile consistency, and cross-client agreement; online experiments show that it suppresses persistent hijacking over recurring queries and transfers to a learned neural router. Our findings establish routing integrity as a new security challenge in FedRAG and highlight the need for stronger defenses for secure federated retrieval.
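To make the attack surface concrete, the following toy sketch illustrates profile-based routing, a forged profile that captures a target query, and a simple trust reweighting driven by returned-evidence relevance. It is not the framework evaluated in the paper: the 3-dimensional embeddings, the client names, the `route` and `update_trust` helpers, and the moving-average update rule are all assumptions made for illustration, and only one of the three feedback signals (retrieval relevance) is modeled.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def route(query, profiles, trust=None, k=1):
    """Rank clients by profile similarity, scaled by a trust weight (default 1.0)."""
    trust = trust or {}
    scores = {c: cosine(query, p) * trust.get(c, 1.0) for c, p in profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def update_trust(trust, client, query, evidence, rate=0.5):
    """Hypothetical update: moving average of returned-evidence relevance."""
    relevance = max(0.0, cosine(query, evidence))
    trust[client] = (1 - rate) * trust.get(client, 1.0) + rate * relevance
    return trust


# Toy embeddings (assumed values, not taken from the paper).
target_query = [1.0, 0.0, 0.0]
profiles = {
    "honest": [0.9, 0.1, 0.0],
    "other": [0.0, 1.0, 0.0],
    "attacker": [1.0, 0.0, 0.0],  # forged to match the target query
}
# What each client's local data actually returns for the query.
evidence = {
    "honest": [0.95, 0.05, 0.0],
    "other": [0.0, 1.0, 0.0],
    "attacker": [0.0, 0.0, 1.0],  # irrelevant underlying data
}

# Without trust weighting, the forged profile wins the routing decision.
hijacked = route(target_query, profiles)

# With feedback over recurring queries, the attacker's weight decays.
trust = {}
for _ in range(3):
    chosen = route(target_query, profiles, trust)[0]
    update_trust(trust, chosen, target_query, evidence[chosen])
recovered = route(target_query, profiles, trust)

print(hijacked, recovered, trust)
```

In this toy setting the forged profile wins the first routing decision because the router sees only profiles. Once the returned evidence is scored against the query, the attacker's weight drops and the recurring query is routed back to the honest client.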
