SurveyLM: A platform to explore emerging value perspectives in augmented language models' behaviors

This white paper presents our work on SurveyLM, a platform for analyzing the emergent alignment behaviors of augmented language models (ALMs) through their dynamically evolving attitudes and value perspectives in complex social contexts. Social Artificial Intelligence (AI) systems, such as ALMs, often operate in nuanced social scenarios where there is no single correct response, or where the answer depends heavily on contextual factors, which makes an in-depth understanding of their alignment dynamics necessary. To address this, we apply survey and experimental methodologies, traditionally used to study social behaviors, to evaluate ALMs systematically, providing new insights into their alignment and emergent behaviors. Moreover, the SurveyLM platform uses the ALMs' own feedback to improve survey and experiment designs. This exploits an underutilized aspect of ALMs and accelerates the development and testing of high-quality survey frameworks while conserving resources. Through SurveyLM, we aim to shed light on the factors that influence ALMs' emergent behaviors, facilitate their alignment with human intentions and expectations, and thereby contribute to the responsible development and deployment of advanced social AI systems. This white paper underscores the platform's potential to deliver robust results, highlighting its significance to alignment research and its implications for future social AI systems.

