Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

Sarah Shoker, Andrew Reddie, Sarah Barrington, Ruby Booth, Miles Brundage, Husanjot Chahal, Michael Depp, Bill Drexel, Ritwik Gupta, Marina Favaro, Jake Hecla, Alan Hickey, Margarita Konaev, Kirthi Kumar, Nathan Lambert, Andrew Lohn, Cullen O'Keefe, Nazneen Rajani, Michael Sellitto, Robert Trager, Leah Walker, Alexa Wehsener, Jessica Young

Foundation models could eventually introduce several pathways for undermining state security: accidents, inadvertent escalation, unintentional conflict, the proliferation of weapons, and interference with human diplomacy are just a few on a long list. The Confidence-Building Measures for Artificial Intelligence workshop, hosted by the Geopolitics Team at OpenAI and the Berkeley Risk and Security Lab at the University of California, brought together a multistakeholder group to think through the tools and strategies for mitigating the potential risks that foundation models pose to international security. Originating in the Cold War, confidence-building measures (CBMs) are actions that reduce hostility, prevent conflict escalation, and improve trust between parties. The flexibility of CBMs makes them a key instrument for navigating the rapid changes in the foundation model landscape. Participants identified the following CBMs that directly apply to foundation models and that are further explained in these workshop proceedings: 1. crisis hotlines, 2. incident sharing, 3. model, transparency, and system cards, 4. content provenance and watermarks, 5. collaborative red teaming and table-top exercises, and 6. dataset and evaluation sharing. Because most foundation model developers are non-government entities, many CBMs will need to involve a wider stakeholder community. These measures can be implemented either by AI labs or by relevant government actors.

