A number of leading AI companies, including OpenAI, Google DeepMind, and Anthropic, have the stated goal of building artificial general intelligence (AGI), that is, AI systems that achieve or exceed human performance across a wide range of cognitive tasks. In pursuing this goal, they may develop and deploy AI systems that pose particularly significant risks. While they have already taken some measures to mitigate these risks, best practices have not yet emerged. To support the identification of best practices, we sent a survey to 92 leading experts from AGI labs, academia, and civil society and received 51 responses. Participants were asked how much they agreed with 50 statements about what AGI labs should do. Our main finding is that participants, on average, agreed with all of them. Many statements received extremely high levels of agreement. For example, 98% of respondents somewhat or strongly agreed that AGI labs should conduct pre-deployment risk assessments, dangerous capabilities evaluations, third-party model audits, and red teaming, and should place safety restrictions on model usage. Ultimately, our list of statements may serve as a helpful foundation for efforts to develop best practices, standards, and regulations for AGI labs.