Modern coding agents integrated into IDEs orchestrate powerful tools and high-privilege system access, creating a high-stakes attack surface. Prior work on Indirect Prompt Injection (IPI) is mainly query-specific: attacks require particular user queries as triggers and therefore generalize poorly. We propose query-agnostic IPI, a new attack paradigm that reliably executes malicious payloads under arbitrary user queries. Our key insight is that malicious payloads should leverage the invariant prompt context (i.e., the system prompt and tool descriptions) rather than variable user queries. We present QueryIPI, an automated framework that treats tool descriptions as optimizable payloads and refines them via iterative, prompt-based black-box optimization. QueryIPI leverages system invariants to generate initial seeds aligned with agent conventions, and uses iterative reflection to resolve instruction-following failures and safety refusals. Experiments on five simulated agents show that QueryIPI achieves a success rate of up to 87%, outperforming the best baseline (50%). Crucially, the generated malicious descriptions transfer to real-world coding agents, highlighting a practical security risk.