The BIG Argument for AI Safety Cases

We present our Balanced, Integrated and Grounded (BIG) argument for assuring the safety of AI systems. The BIG argument adopts a whole-system approach to constructing a safety case for AI systems of varying capability, autonomy and criticality. Firstly, it is balanced by addressing safety alongside other critical ethical issues such as privacy and equity, acknowledging complexities and trade-offs in the broader societal impact of AI. Secondly, it is integrated by bringing together the social, ethical and technical aspects of safety assurance in a way that is traceable and accountable. Thirdly, it is grounded in long-established safety norms and practices, such as being sensitive to context and maintaining risk proportionality. Whether the AI capability is narrow and constrained or general-purpose and powered by a frontier or foundational model, the BIG argument insists on a systematic treatment of safety. Further, it places a particular focus on the novel hazardous behaviours emerging from the advanced capabilities of frontier AI models and the open contexts in which they are rapidly being deployed. These complex issues are considered within a wider AI safety case, approaching assurance from both technical and sociotechnical perspectives. Examples illustrating the use of the BIG argument are provided throughout the paper.


