Prior work on trustworthy AI emphasizes model-internal properties such as bias mitigation, adversarial robustness, and interpretability. As AI systems evolve into autonomous agents deployed in open environments and increasingly connected to payments or assets, the operational meaning of trust shifts to end-to-end outcomes: whether an agent completes tasks, follows user intent, and avoids failures that cause material or psychological harm. These risks are fundamentally product-level and cannot be eliminated by technical safeguards alone because agent behavior is inherently stochastic. To address this gap between model-level reliability and user-facing assurance, we propose a complementary framework based on risk management. Drawing inspiration from financial underwriting, we introduce the \textbf{Agentic Risk Standard (ARS)}, a payment settlement standard for AI-mediated transactions. ARS integrates risk assessment, underwriting, and compensation into a single transaction framework that protects users when interacting with agents. Under ARS, users receive predefined and contractually enforceable compensation in cases of execution failure, misalignment, or unintended outcomes. This shifts trust from an implicit expectation about model behavior to an explicit, measurable, and enforceable product guarantee. We also present a simulation study analyzing the social benefits of applying ARS to agentic transactions. ARS's implementation can be found at https://github.com/t54-labs/AgenticRiskStandard.