Enterprise AI systems increasingly deploy multiple intelligent agents across mission-critical workflows that must satisfy hard policy constraints, bounded risk exposure, and comprehensive auditability (SOX, HIPAA, GDPR). Existing coordination methods, including cooperative multi-agent reinforcement learning (MARL), consensus protocols, and centralized planners, optimize expected reward while treating constraints only implicitly. This paper introduces CAMCO (Constraint-Aware Multi-Agent Cognitive Orchestration), a runtime coordination layer that models multi-agent decision-making as a constrained optimization problem. CAMCO integrates three mechanisms: (i) a constraint projection engine that enforces policy-feasible actions via convex projection, (ii) adaptive risk-weighted Lagrangian utility shaping, and (iii) an iterative negotiation protocol with provably bounded convergence. Unlike training-time constrained RL, CAMCO operates as deployment-time middleware compatible with any agent architecture, and its policy predicates are designed for direct integration with production policy engines such as Open Policy Agent (OPA). Evaluation across three enterprise scenarios, including a comparison against a constrained Lagrangian MARL baseline, shows zero policy violations, risk exposure below threshold (mean ratio 0.71), 92-97% utility retention, and mean convergence in 2.4 iterations.
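To make the three mechanisms concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each agent picks a scalar allocation, that the hard policy constraint is a per-agent cap plus a shared budget (a convex set), that utility is quadratic around a hypothetical target, and that risk is linear in the allocation. All names (`project`, `negotiate`, `targets`, `risk_weights`, `caps`, `budget`, `risk_limit`) are illustrative assumptions.

```python
from typing import List, Tuple


def clip(v: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, v))


def project(proposal: List[float], caps: List[float], budget: float,
            iters: int = 60) -> List[float]:
    """Euclidean projection onto {x : 0 <= x_i <= caps_i, sum(x) <= budget}.

    Assumes budget >= 0. The KKT conditions give x_i = clip(p_i - tau, 0, cap_i)
    for a shared shift tau >= 0, which is found here by bisection.
    """
    boxed = [clip(p, 0.0, c) for p, c in zip(proposal, caps)]
    if sum(boxed) <= budget:
        return boxed  # already policy-feasible after applying the caps
    lo, hi = 0.0, max(proposal)  # at tau = hi every entry is clipped to 0
    for _ in range(iters):
        tau = (lo + hi) / 2
        total = sum(clip(p - tau, 0.0, c) for p, c in zip(proposal, caps))
        if total > budget:
            lo = tau
        else:
            hi = tau
    # Use hi so the returned point never exceeds the budget.
    return [clip(p - hi, 0.0, c) for p, c in zip(proposal, caps)]


def negotiate(targets: List[float], risk_weights: List[float],
              caps: List[float], budget: float, risk_limit: float,
              step: float = 0.5, dual_step: float = 0.5,
              tol: float = 1e-6, max_rounds: int = 200
              ) -> Tuple[List[float], float, int]:
    """Iterate propose -> project -> update the risk multiplier.

    Assumed shaped utility per agent: -(x_i - t_i)^2 / 2 - lam * w_i * x_i.
    Returns (allocation, final multiplier, rounds used).
    """
    x = [0.0] * len(targets)
    lam = 0.0  # Lagrange multiplier on the (soft) risk constraint
    for rounds in range(1, max_rounds + 1):
        # Each agent takes a gradient step on its risk-shaped utility.
        proposal = [xi + step * ((t - xi) - lam * w)
                    for xi, t, w in zip(x, targets, risk_weights)]
        # The coordinator enforces the hard constraints by projection.
        new_x = project(proposal, caps, budget)
        # Dual ascent: raise lam when total risk exceeds the limit.
        risk = sum(w * xi for w, xi in zip(risk_weights, new_x))
        lam = max(0.0, lam + dual_step * (risk - risk_limit))
        moved = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if moved < tol:
            return x, lam, rounds
    return x, lam, max_rounds
```

In this sketch the cap and budget constraints hold at every round because each returned allocation is a projection output, whereas the risk limit is only encouraged through the multiplier `lam`. A call such as `negotiate([3, 2, 1], [1.0, 0.5, 0.2], [4, 4, 4], budget=5, risk_limit=2.5)` returns the allocation, the final multiplier, and the number of rounds used.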