Agent Control Protocol: Admission Control for Agent Actions

Version notes (from arXiv):
v1.23: deviation collapse (Exp 9), BAR metric, counterfactual evaluation, Failure Condition Preservation, ACP-RISK-3.0 in Technical Mechanisms, extended Related Work.
v1.22: stateless vs. stateful comparison (500/500 vs. 2/500), state-mixing (Exp 7), and the ACP-RISK-3.0 fix (Exp 8).
v1.21: extended TLA+ model (9 invariants, 4 temporal properties, 5.6M states), token replay, PQ hybrid signing.
v1.20: adversarial evaluation and performance.

Autonomous agents can produce harmful behavioral patterns from individually valid requests. This class of threat cannot be addressed by per-request policy evaluation, because stateless engines evaluate each request in isolation and cannot enforce properties that depend on execution history. We present ACP, a temporal admission control protocol that enforces behavioral properties over execution traces by combining static risk scoring with stateful signals (anomaly accumulation, cooldown) through a LedgerQuerier abstraction that separates decision logic from state management. Under a 500-request workload where every request is individually valid (RS=35), a stateless engine approves all 500 requests. ACP limits autonomous execution to 2 out of 500 (0.4%), escalating after 3 actions and enforcing denial after 11. We identify a bounded state-mixing vulnerability where agent-level anomaly aggregation inadvertently elevates risk across unrelated contexts. ACP-RISK-3.0 resolves this by scoping temporal signals to (agentID, capability, resource), preserving enforcement within each context. We further identify deviation collapse: a degenerate regime in which enforcement is active but never exercised because upstream constraints eliminate the inputs required for DENIED decisions. We introduce Boundary Activation Rate (BAR) as a metric and counterfactual evaluation as a detection mechanism (Experiment 9: BAR drops from 0.70 to 0.00 under sanitization, restored to 1.00 via counterfactual injection). Decision latency: 767-921 ns (p50); throughput: 920,000 req/s. Safety and liveness model-checked via TLA+ (9 invariants, 4 temporal properties, 0 violations across 5,684,342 states), validated by 73 signed conformance vectors. Specification and implementation: https://github.com/chelof100/acp-framework-en
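
The contrast between stateless and stateful admission described above can be illustrated with a minimal sketch. It is not the ACP implementation: the abstract does not give the ACP-RISK-3.0 scoring formula, so the static cutoff (APPROVE_BELOW), the count-based thresholds (ESCALATE_AT, DENY_AT), and the InMemoryLedger class are all hypothetical, chosen only so that a 500-request workload at RS=35 yields 500 stateless approvals and 2 stateful approvals. The reading that the 3rd action is the first escalated and the 11th the first denied is likewise an assumption. The ledger is keyed by (agent_id, capability, resource) to mirror the context scoping the abstract attributes to ACP-RISK-3.0.

```python
from collections import Counter
from dataclasses import dataclass, field

# All three constants are illustrative assumptions, not values from the ACP spec.
APPROVE_BELOW = 40   # hypothetical static cutoff; RS=35 falls below it
ESCALATE_AT = 3      # assumed: the 3rd action in a context is escalated
DENY_AT = 11         # assumed: the 11th action in a context is denied


@dataclass(frozen=True)
class Request:
    agent_id: str
    capability: str
    resource: str
    risk_score: int


def stateless_decide(req: Request) -> str:
    """Per-request evaluation: looks only at the static risk score."""
    return "APPROVED" if req.risk_score < APPROVE_BELOW else "DENIED"


@dataclass
class InMemoryLedger:
    """Toy stand-in for a ledger; counts actions per scoped context."""
    counts: Counter = field(default_factory=Counter)

    def record(self, req: Request) -> int:
        # Scope the temporal signal to (agent, capability, resource) so that
        # activity in one context does not raise risk in an unrelated one.
        key = (req.agent_id, req.capability, req.resource)
        self.counts[key] += 1
        return self.counts[key]


def stateful_decide(req: Request, ledger: InMemoryLedger) -> str:
    """Combine the static check with an execution-history signal."""
    if stateless_decide(req) == "DENIED":
        return "DENIED"
    n = ledger.record(req)  # position of this action within its context
    if n >= DENY_AT:
        return "DENIED"
    if n >= ESCALATE_AT:
        return "ESCALATED"
    return "APPROVED"


if __name__ == "__main__":
    workload = [Request("agent-1", "write", "db/orders", 35) for _ in range(500)]
    print(Counter(stateless_decide(r) for r in workload))
    ledger = InMemoryLedger()
    print(Counter(stateful_decide(r, ledger) for r in workload))
```

Because the ledger sits behind its own small interface, the decision function stays free of storage details, which is the separation the abstract ascribes to the LedgerQuerier abstraction. In this toy, a request from the same agent against a different resource starts from a fresh count; the split between escalated and denied requests is an artifact of the assumed thresholds, not a reported result.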
