Autonomous agents can produce harmful behavioral patterns from individually valid requests. This class of threat cannot be addressed by per-request policy evaluation, because stateless engines evaluate each request in isolation and cannot enforce properties that depend on execution history. We present ACP, a temporal admission control protocol that enforces behavioral properties over execution traces by combining static risk scoring with stateful signals (anomaly accumulation, cooldown) through a LedgerQuerier abstraction that separates decision logic from state management. Under a 500-request workload where every request is individually valid (RS=35), a stateless engine approves all 500 requests. ACP limits autonomous execution to 2 out of 500 (0.4%), escalating after 3 actions and enforcing denial after 11. We identify a bounded state-mixing vulnerability where agent-level anomaly aggregation inadvertently elevates risk across unrelated contexts. ACP-RISK-3.0 resolves this by scoping temporal signals to (agentID, capability, resource), preserving enforcement within each context. We further identify deviation collapse: a degenerate regime in which enforcement is active but never exercised because upstream constraints eliminate the inputs required for DENIED decisions. We introduce Boundary Activation Rate (BAR) as a metric and counterfactual evaluation as a detection mechanism (Experiment 9: BAR drops from 0.70 to 0.00 under sanitization, restored to 1.00 via counterfactual injection). Decision latency: 767-921 ns (p50); throughput: 920,000 req/s. Safety and liveness model-checked via TLA+ (9 invariants, 4 temporal properties, 0 violations across 5,684,342 states), validated by 73 signed conformance vectors. Specification and implementation: https://github.com/chelof100/acp-framework-en
翻译:暂无翻译