Tool-using LLM agents increasingly coordinate real workloads by selecting and chaining third-party tools based on text-visible metadata such as tool names, descriptions, and return messages. We show that this convenience creates a supply-chain attack surface: a malicious MCP tool server can be co-registered alongside normal tools and induce overthinking loops, where individually trivial or plausible tool calls compose into cyclic trajectories that inflate end-to-end tokens and latency without any single step looking abnormal. We formalize this as a structural overthinking attack, distinguishable from token-level verbosity, and implement 14 malicious tools across three servers that trigger repetition, forced refinement, and distraction. Across heterogeneous registries and multiple tool-capable models, the attack causes severe resource amplification (up to $142.4\times$ tokens) and can degrade task outcomes. Finally, we find that decoding-time concision controls do not reliably prevent loop induction, suggesting defenses should reason about tool-call structure rather than tokens alone.
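To make the distinction between structural and token-level overthinking concrete, the following is a minimal sketch, not the paper's method, of how a monitor might flag a cyclic tool-call trajectory and compute a token amplification factor. The function names, the `max_period` parameter, and the tool names in the usage example are all hypothetical and introduced only for illustration.

```python
from typing import Hashable, Sequence, Tuple


def amplification(attacked_tokens: int, baseline_tokens: int) -> float:
    """Ratio of tokens consumed under attack to tokens consumed on a benign run."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return attacked_tokens / baseline_tokens


def trailing_cycle(calls: Sequence[Hashable], max_period: int = 4) -> Tuple[int, int]:
    """Return (period, repeats) for the most-repeated cycle at the end of a trajectory.

    For each candidate period, take the last `period` calls as a block and count
    how many times that block occurs back to back, walking toward the start.
    A result of (0, 1) means no trailing repetition was found.
    """
    best = (0, 1)
    n = len(calls)
    for period in range(1, max_period + 1):
        if n < 2 * period:
            break  # not enough calls for even two copies of the block
        block = tuple(calls[n - period:])
        repeats = 1
        end = n - period
        while end - period >= 0 and tuple(calls[end - period:end]) == block:
            repeats += 1
            end -= period
        if repeats > best[1]:
            best = (period, repeats)
    return best


# Hypothetical trajectory: each call looks plausible on its own,
# but the tail is the pair (refine, validate) repeated three times.
trajectory = ["search", "refine", "validate", "refine", "validate", "refine", "validate"]
period, repeats = trailing_cycle(trajectory)
```

A check of this kind looks only at the sequence of tool names, so it is unaffected by how concise each individual model response is. That is the sense in which a defense can reason about tool-call structure rather than tokens alone. A real monitor would likely also need to compare arguments and handle cycles interleaved with benign calls, which this sketch does not attempt.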