Evaluating AI Companies' Frontier Safety Frameworks: Methodology and Results

Following the AI Seoul Summit in 2024, twelve AI companies published frontier AI safety frameworks (Frameworks) outlining their approaches to managing catastrophic risks from advanced AI systems. Emerging legislation increasingly treats these Frameworks as external accountability mechanisms, incorporating them into reporting requirements. But what do the Frameworks actually commit each company to do? This study assesses all twelve Frameworks against 65 weighted criteria across four dimensions: risk identification, risk analysis & evaluation, risk treatment, and risk governance. Our criteria adapt established risk management principles from other high-risk industries (e.g. aviation, nuclear power) to the frontier AI context, following Campos et al. (2025). Overall scores range from 8% (Cohere) to 34% (Anthropic), with a median of 18%. Many aspects are missing or under-specified. These low scores may be natural given how new AI risk management is compared to industries with decades of practice. Even so, the current Frameworks are limited as accountability mechanisms: their vague commitments make it difficult to predict company decisions, assess whether planned responses are adequate, or determine whether commitments have been kept. Higher scores appear feasible within current constraints: a company adopting every leading practice already in use among its peers would score 51%, almost triple the median.
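The abstract does not spell out how the 65 criteria are weighted or aggregated, so the following is only a minimal sketch of one plausible reading: each criterion receives a grade in [0, 1], the overall score is the weight-normalized mean, and the "all leading practices" composite takes the highest grade any company achieved on each criterion. The criterion names, weights, grades, and company labels below are hypothetical toy values chosen for illustration, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    dimension: str
    weight: float  # hypothetical weight; the study's actual weights are not given here


def weighted_score(criteria, grades):
    """Weight-normalized mean of per-criterion grades in [0, 1].

    A criterion missing from `grades` is treated as ungraded (0.0).
    """
    total_weight = sum(c.weight for c in criteria)
    earned = sum(c.weight * grades.get(c.name, 0.0) for c in criteria)
    return earned / total_weight


def best_of_peers(all_grades):
    """Composite grades: the maximum grade any company achieved per criterion."""
    best = {}
    for grades in all_grades.values():
        for name, grade in grades.items():
            best[name] = max(best.get(name, 0.0), grade)
    return best


# Hypothetical toy data: one criterion per dimension named in the abstract.
CRITERIA = [
    Criterion("risk_identification", "risk identification", 1.0),
    Criterion("risk_thresholds", "risk analysis & evaluation", 2.0),
    Criterion("mitigations", "risk treatment", 2.0),
    Criterion("oversight", "risk governance", 1.0),
]

GRADES = {
    "company_a": {"risk_identification": 0.5, "risk_thresholds": 0.25,
                  "mitigations": 0.5, "oversight": 0.0},
    "company_b": {"risk_identification": 0.25, "risk_thresholds": 0.5,
                  "mitigations": 0.0, "oversight": 0.5},
}

if __name__ == "__main__":
    for company, grades in GRADES.items():
        print(f"{company}: {weighted_score(CRITERIA, grades):.0%}")
    composite = weighted_score(CRITERIA, best_of_peers(GRADES))
    print(f"best-of-peers composite: {composite:.0%}")
```

Under this reading, the composite can never fall below any individual company's score, which is why it serves as a rough indication of what is achievable with practices that already exist in the field.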

