Since the publication of the first International AI Safety Report, AI capabilities have continued to improve across key domains. These advances have been driven primarily by new training techniques that teach AI systems to reason step by step and by inference-time enhancements, rather than by simply training larger models. As a result, general-purpose AI systems can solve more complex problems in a range of domains, from scientific research to software development. Their scores on benchmarks for coding, mathematics, and expert-level science questions have continued to improve, though reliability challenges persist: systems excel on some tasks while failing completely on others. These capability improvements also have implications for multiple risks, including risks from biological weapons and cyber attacks, and they pose new challenges for monitoring and controllability. This update examines how AI capabilities have improved since the first Report, then focuses on key risk areas where substantial new evidence warrants updated assessments.