As LLMs increasingly power agents that interact with external tools, tool use has become an essential mechanism for extending their capabilities. These agents typically select tools from growing databases or marketplaces to solve user tasks, creating implicit competition among tool providers and developers for visibility and usage. In this paper, we show that this selection process harbors a critical vulnerability: by iteratively manipulating tool names and descriptions, adversaries can systematically bias agents toward selecting specific tools, gaining an unfair advantage over equally capable alternatives. We present ToolTweak, a lightweight automatic attack that raises selection rates from a baseline of around 20% to as high as 81%, with strong transferability between open-source and closed-source models. Beyond individual tools, we show that such attacks cause distributional shifts in tool usage, revealing risks to fairness, competition, and security in emerging tool ecosystems. To mitigate these risks, we evaluate two defenses, paraphrasing and perplexity filtering, which reduce bias and lead agents to select functionally similar tools more equally. All code will be open-sourced upon acceptance.