As Artificial Intelligence (AI) systems proliferate, the need for systematic, transparent, and actionable processes for evaluating them is growing. While many resources exist to support AI evaluation, they have several limitations. Few address both the process of designing, developing, and deploying an AI system and the outcomes it produces. Furthermore, few are end-to-end and operational, give actionable guidance, or present evidence of usability or effectiveness in practice. In this paper, we introduce a third-party AI assurance framework that addresses these gaps. We focus on third-party assurance to prevent conflicts of interest and to ensure the credibility and accountability of the process. We begin by distinguishing assurance from audits along several key dimensions. Guided by design principles, we then reflect on the shortcomings of existing resources to derive a set of design requirements for AI assurance. On this basis, we construct a prototype assurance process that consists of (1) a responsibility assignment matrix that determines each stakeholder's level of involvement at each stage of the AI lifecycle, (2) an interview protocol for each stakeholder of an AI system, (3) a maturity matrix that assesses an AI system's adherence to best practices, and (4) a template for an assurance report that draws on more mature assurance practices in business accounting. We conduct early validation of our AI assurance framework by applying it to two distinct AI use cases -- a business document tagging tool for downstream processing at a large private firm, and a housing resource allocation tool at a public agency -- and by conducting expert validation interviews. Our findings provide early evidence that our AI assurance framework is sound and comprehensive, usable across different organizational contexts, and effective at identifying bespoke issues with AI systems.