ControlLM: Crafting Diverse Personalities for Language Models

As language models continue to scale in size and capability, they display an array of emergent behaviors, both beneficial and concerning. This heightens the need to control model behavior. We aim to control the personality traits of language models at inference time, giving them varied character features that can be matched to the requirements of different types of tasks. Personality is a higher-level, more abstract representation of a language model's behavior. We introduce ControlLM, which leverages differential activation patterns in the model's latent space, derived from contrasting behavioral prompts, to influence the model's personality traits at inference. This approach allows precise, real-time adjustment of model behavior. First, we demonstrate ControlLM's capacity to elicit diverse persona behaviors without any training, while precise control allows personality traits to closely match average human values. Subsequently, we showcase improved reasoning and question answering through selective amplification of beneficial attributes such as conscientiousness and friendliness. We hope that this work will inspire research on controlling human-like behaviors of language models and provide insights for future research. Our code is publicly available at: https://github.com/wengsyx/ControlLM.
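The abstract describes the mechanism only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes that the "differential activation pattern" is the difference between mean activations collected under two contrasting prompt sets, and that control is applied by adding a scaled copy of that difference to a hidden state. The function names, the toy 3-dimensional vectors, and the `strength` parameter are all hypothetical.

```python
from statistics import fmean


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    return [fmean(col) for col in zip(*rows)]


def control_direction(positive_acts, negative_acts):
    """Difference of mean activations between two contrasting prompt sets (assumed form)."""
    pos = mean_vector(positive_acts)
    neg = mean_vector(negative_acts)
    return [p - n for p, n in zip(pos, neg)]


def steer(hidden, direction, strength):
    """Shift one hidden-state vector along the control direction."""
    return [h + strength * d for h, d in zip(hidden, direction)]


# Toy stand-ins for activations; a real model would supply these from a chosen layer.
positive_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]  # prompts expressing the trait
negative_acts = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]  # prompts expressing the opposite

direction = control_direction(positive_acts, negative_acts)
steered = steer([0.5, 0.5, 0.5], direction, strength=0.5)
print(direction, steered)
```

In this sketch a positive `strength` would amplify the trait and a negative one would suppress it. With a real model, the same shift would have to be applied to a layer's output during the forward pass; how ControlLM chooses layers and scales is not stated in the abstract.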
