International AI Safety Report 2025: First Key Update: Capabilities and Risk Implications

Yoshua Bengio, Stephen Clare, Carina Prunkl, Shalaleh Rismani, Maksym Andriushchenko, Ben Bucknall, Philip Fox, Tiancheng Hu, Cameron Jones, Sam Manning, Nestor Maslej, Vasilios Mavroudis, Conor McGlynn, Malcolm Murray, Charlotte Stix, Lucia Velasco, Nicole Wheeler, Daniel Privitera, Sören Mindermann, Daron Acemoglu, Thomas G. Dietterich, Fredrik Heintz, Geoffrey Hinton, Nick Jennings, Susan Leavy, Teresa Ludermir, Vidushi Marda, Helen Margetts, John McDermid, Jane Munga, Arvind Narayanan, Alondra Nelson, Clara Neppel, Gopal Ramchurn, Stuart Russell, Marietje Schaake, Bernhard Schölkopf, Alavaro Soto, Lee Tiedrich, Gaël Varoquaux, Andrew Yao, Ya-Qin Zhang, Leandro Aguirre, Olubunmi Ajala, Fahad Albalawi, Noora AlMalek, Christian Busch, André Carvalho, Jonathan Collas, Amandeep Gill, Ahmet Hatip, Juha Heikkilä, Chris Johnson, Gill Jolly, Ziv Katzir, Mary Kerema, Hiroaki Kitano, Antonio Krüger, Aoife McLysaght, Oleksii Molchanovskyi, Andrea Monti, Kyoung Mu Lee, Mona Nemer, Nuria Oliver, Raquel Pezoa, Audrey Plonk, José Portillo, Balaraman Ravindran, Hammam Riza, Crystal Rugege, Haroon Sheikh, Denise Wong, Yi Zeng, Liming Zhu

Since the publication of the first International AI Safety Report, AI capabilities have continued to improve across key domains. These advances have been driven primarily by new training techniques that teach AI systems to reason step-by-step and by inference-time enhancements, rather than by simply training larger models. As a result, general-purpose AI systems can solve more complex problems in a range of domains, from scientific research to software development. Their scores on benchmarks for coding, mathematics, and answering expert-level science questions have continued to improve, though reliability challenges persist, with systems excelling on some tasks while failing completely on others. These capability improvements also have implications for multiple risks, including risks from biological weapons and cyber attacks. Finally, they pose new challenges for monitoring and controllability. This update examines how AI capabilities have improved since the first Report, then focuses on key risk areas where substantial new evidence warrants updated assessments.

