AI agents can now take irreversible actions in operational systems, yet the losses they cause are still not clearly assigned, priced, or transferred. Providers often disclaim consequential damages, users bear uncompensated losses, and default human review limits the efficiency gains of automation. We ask when autonomous AI deployment becomes economically acceptable despite failure risk. Our answer is to quantify risk at the level of the customer-task-trace episode and transfer it through insurance: automation is acceptable when its expected benefit exceeds the sum of the premium, the control cost, and the remaining risk. This requires a defined role with bounded permissions and comparable traces. We introduce trace-economic underwriting, which maps tool-use traces to customer exposure and claimable loss, then uses this representation for pricing, control, and risk transfer. It relies on deterministic economic labels rather than an LLM judge. In our trace-to-loss testbed, trace-economic pricing reduces pricing mean absolute error (MAE) from $17.7K to $569 and removes regressive cross-subsidy. A 300-trace expert audit accepts 295 labels unchanged. On 1,000 real SWE-smith traces, trace-conditioned controls reduce the 95% conditional value at risk (CVaR95) by 72%. Theorem 1 gives a finite-sample scope condition. We release code, labels, and audit sheets.
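The acceptance condition and the tail-risk metric named in the abstract can be illustrated with a minimal Python sketch. The class and field names (`EpisodeEconomics`, `expected_benefit`, and so on) are hypothetical and chosen only for illustration, and the CVaR estimator shown is one common empirical convention (the mean of the worst 5% of losses), not necessarily the estimator used in the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeEconomics:
    """Per-episode quantities in USD. All field names are illustrative assumptions."""
    expected_benefit: float  # expected gain from automating the episode
    premium: float           # insurance premium charged for the episode
    control_cost: float      # cost of controls such as review or guardrails
    remaining_risk: float    # expected loss still borne after risk transfer


def automation_acceptable(e: EpisodeEconomics) -> bool:
    # Acceptable when the benefit exceeds premium + control cost + remaining risk.
    return e.expected_benefit > e.premium + e.control_cost + e.remaining_risk


def empirical_cvar(losses, alpha=0.95):
    """Mean of the worst ceil((1 - alpha) * n) losses (assumed convention)."""
    if not losses:
        raise ValueError("losses must be non-empty")
    ordered = sorted(losses, reverse=True)
    # round() guards against floating-point noise such as 5.000000000000004.
    tail = max(1, math.ceil(round((1 - alpha) * len(ordered), 9)))
    return sum(ordered[:tail]) / tail
```

For example, an episode with a $1,000 expected benefit, a $300 premium, a $200 control cost, and $100 of remaining risk passes the check because 1,000 > 600; the same costs against a $500 benefit do not.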