CmdNeedle: Measuring the Incompleteness of Command Denylists for AI Agents

The adoption of AI agents is increasing rapidly. Terminal AI agents, i.e., AI agents that run in terminal environments, are a widely used type of AI agent. They rely heavily on shell command execution to interact with the host system, and they adopt a three-list command-gating mechanism to mitigate the security risks that command execution introduces, with denylists serving as the load-bearing component. However, modern operating systems ship a large, ever-expanding set of shell commands with complex functionality. We observe that even a built-in denylist of Claude Code, which is well maintained by its developers, can overlook bypass commands that defeat it. Such oversights lead to fragile command denylists that fail to block even the operations that practitioners expect them to block. This paper presents the first systematic characterization of command denylist fragility in terminal AI agents. It formalizes the command denylist fragility problem and proposes an LLM-driven pipeline, CmdNeedle, to detect such fragility. CmdNeedle prompts an LLM to propose possible bypasses and iteratively repairs them using feedback from a validator that executes them in a sandbox. In the evaluation, we applied CmdNeedle to 1,709 real-world command denylists (containing 13,332 denylist rules) collected from GitHub. The evaluation yields several key findings: 69.0% to 98.6% of the denylists are fragile, this fragility occurs consistently across projects and agents, and several possible root causes of the fragility are validated. We hope our pipeline and findings will facilitate future research and practice on the command denylists used by AI agents.
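The propose, validate, and repair loop described above can be illustrated with a minimal Python sketch. This is not CmdNeedle's implementation: the abstract does not specify how denylist rules are matched, how the LLM is prompted, or how the sandbox works. The sketch assumes a denylist that matches only the first token of a command, replaces the LLM with a scripted proposer, and replaces sandbox execution with a fake validator that runs nothing. All function names (`is_denied`, `find_bypass`, `scripted_proposer`, `fake_validator`) are hypothetical.

```python
import shlex
from typing import Callable, Optional, Tuple


def is_denied(command: str, denylist: list[str]) -> bool:
    """Assumed gate: deny when the command's first token is a denylist rule."""
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] in denylist


def find_bypass(
    denylist: list[str],
    propose: Callable[[Optional[str]], str],
    validate: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask for a candidate, check it, and pass failure feedback to the next round."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = propose(feedback)
        if is_denied(candidate, denylist):
            # The denylist did its job; ask the proposer for something else.
            feedback = f"blocked by denylist: {candidate}"
            continue
        ok, message = validate(candidate)
        if ok:
            return candidate  # Passes the gate and still achieves the effect.
        feedback = message  # Repair signal for the next proposal.
    return None


def scripted_proposer() -> Callable[[Optional[str]], str]:
    """Stand-in for an LLM: returns fixed candidates and ignores the feedback."""
    candidates = iter(["rm -rf build", "fnd build -delete", "find build -delete"])
    return lambda feedback: next(candidates)


def fake_validator(command: str) -> Tuple[bool, str]:
    """Stand-in for sandbox execution: nothing is actually run."""
    program = shlex.split(command)[0]
    if program in {"find", "unlink"}:
        return True, "target removed"
    return False, f"{program}: command not found"


if __name__ == "__main__":
    print(find_bypass(["rm"], scripted_proposer(), fake_validator))
```

With a denylist containing only `rm`, the first candidate is blocked, the second fails validation because of a typo, and the repaired third candidate passes the gate, so that denylist counts as fragile under this sketch's assumptions. A real validator would have to execute each candidate in an isolated sandbox and check that the forbidden effect actually occurred.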

