SARC: A Governance-by-Architecture Framework for Agentic AI Systems

Agentic AI systems increasingly act through tools, sub-agents, and external services, but governance controls are still commonly attached to prompts, dashboards, or post-hoc documentation. This creates a structural mismatch in regulated settings: obligations that must constrain execution are often evaluated only after execution has occurred. We introduce SARC, a runtime governance architecture for tool-using agents that treats constraints as first-class specification objects alongside state, action space, and reward. A SARC specification declares each constraint's source, class, predicate, verification point, response protocol, and operating point, and compiles these into four enforcement sites in the agent loop: a Pre-Action Gate, an Action-Time Monitor, a Post-Action Auditor, and an Escalation Router. We formalize the minimal invariants required for specification-trace correspondence, show why finite reward penalties do not generally substitute for hard runtime constraints, and extend the architecture to multi-agent workflows through constraint propagation, authority intersection, and attribution-preserving trace trees. We implement a prototype audit checker and report a reproducible synthetic evaluation over 50 seeds comparing SARC against post-hoc audit, output filtering, workflow rules, and policy-as-code-only baselines on a procurement task. SARC executes zero hard-constraint violations under exact predicates; its declared Post-Action Auditor (PAA) throttling response reduces soft-window overages by 89.5% relative to policy-as-code-only. Predicate-noise and enforcement-failure sweeps are consistent with the claim that residual hard violations under SARC scale with enforcement-stack error rather than environmental violation opportunity. SARC provides the architectural substrate through which obligations can be made executable, inspectable, and auditable at runtime.
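To make the specification fields listed above concrete, the following is a minimal sketch of how one declared constraint might be checked at a Pre-Action Gate. It is an illustration under stated assumptions, not the SARC prototype: the names `ConstraintSpec` and `pre_action_gate`, the string labels ("hard", "pre_action", "block"), the action dictionary layout, and the spend limit are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Action = Dict[str, object]  # hypothetical action record, e.g. a proposed tool call


@dataclass(frozen=True)
class ConstraintSpec:
    """One declared constraint. Field names are illustrative, not SARC's schema."""
    name: str
    source: str                  # where the obligation comes from
    constraint_class: str        # "hard" or "soft" (assumed labels)
    predicate: Callable[[Action, float], bool]  # True when the action complies
    verification_point: str      # "pre_action", "action_time", or "post_action"
    response: str                # e.g. "block", "throttle", "escalate"
    operating_point: float       # threshold passed to the predicate


def pre_action_gate(action: Action, specs: List[ConstraintSpec]) -> Tuple[bool, List[dict]]:
    """Evaluate pre-action constraints before execution and log one trace entry each."""
    allowed = True
    trace = []
    for spec in specs:
        if spec.verification_point != "pre_action":
            continue  # other enforcement sites would handle these constraints
        ok = spec.predicate(action, spec.operating_point)
        trace.append({
            "constraint": spec.name,
            "satisfied": ok,
            "response": None if ok else spec.response,
        })
        # Only a failed hard constraint prevents the action from executing.
        if not ok and spec.constraint_class == "hard":
            allowed = False
    return allowed, trace


# Hypothetical procurement constraint: a single purchase may not exceed a cap.
spend_cap = ConstraintSpec(
    name="spend_cap",
    source="procurement policy (hypothetical)",
    constraint_class="hard",
    predicate=lambda a, limit: float(a["amount"]) <= limit,
    verification_point="pre_action",
    response="block",
    operating_point=10_000.0,  # assumed value for the example
)

allowed, trace = pre_action_gate({"tool": "purchase", "amount": 12_500.0}, [spend_cap])
```

In this sketch the gate returns both a decision and a trace entry per evaluated constraint, so that every check leaves a record that can later be compared against the specification. A blocked action never reaches execution, which is the property a finite reward penalty applied after the fact would not guarantee.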
