Frontier AI Regulation: Managing Emerging Risks to Public Safety

Markus Anderljung, Joslyn Barnhart, Anton Korinek, Jade Leung, Cullen O'Keefe, Jess Whittlestone, Shahar Avin, Miles Brundage, Justin Bullock, Duncan Cass-Beggs, Ben Chang, Tantum Collins, Tim Fist, Gillian Hadfield, Alan Hayes, Lewis Ho, Sara Hooker, Eric Horvitz, Noam Kolt, Jonas Schuett, Yonadav Shavit, Divya Siddarth, Robert Trager, Kevin Wolf

From arXiv. Update July 11th: added missing footnote back in; adjusted author order (mistakenly non-alphabetical among the first 6 authors) and adjusted affiliations (Jess Whittlestone's affiliation was mistagged and Gillian Hadfield had SRI added to her affiliations). Updated September 4th: various typos.

Advanced AI models hold the promise of tremendous benefits for humanity, but society needs to proactively manage the accompanying risks. In this paper, we focus on what we term "frontier AI" models: highly capable foundation models that could possess dangerous capabilities sufficient to pose severe risks to public safety. Frontier AI models pose a distinct regulatory challenge: dangerous capabilities can arise unexpectedly; it is difficult to robustly prevent a deployed model from being misused; and, it is difficult to stop a model's capabilities from proliferating broadly. To address these challenges, at least three building blocks for the regulation of frontier models are needed: (1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models. Industry self-regulation is an important first step. However, wider societal discussions and government intervention will be needed to create standards and to ensure compliance with them. We consider several options to this end, including granting enforcement powers to supervisory authorities and licensure regimes for frontier AI models. Finally, we propose an initial set of safety standards. These include conducting pre-deployment risk assessments; external scrutiny of model behavior; using risk assessments to inform deployment decisions; and monitoring and responding to new information about model capabilities and uses post-deployment. We hope this discussion contributes to the broader conversation on how to balance public safety risks and innovation benefits from advances at the frontier of AI development.

