Humans learn social skills through both imitation and social interaction. This social learning process remains largely understudied in existing research on building language agents. Motivated by this gap, we propose an interactive learning method, SOTOPIA-$\pi$, that improves the social intelligence of language agents. The method applies behavior cloning and self-reinforcement training to social interaction data that has been filtered according to large language model (LLM) ratings. We show that our training method allows a 7B LLM to reach the social goal completion ability of an expert model (a GPT-4-based agent), while improving the safety of language agents and maintaining general QA ability on the MMLU benchmark. We also find that this training paradigm uncovers a difficulty in LLM-based evaluation of social intelligence: LLM-based evaluators overestimate the abilities of language agents trained specifically for social interaction.
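The data filtering step described above can be illustrated with a minimal sketch. It assumes each interaction episode carries a single LLM-assigned rating and that filtering keeps the highest-rated episodes per scenario. The `Episode` fields, the 0-10 rating scale, the `min_rating` threshold, and the `top_k` cutoff are all hypothetical choices made for illustration and are not taken from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    scenario_id: str  # hypothetical identifier of the social task
    agent: str        # which policy produced the dialogue: "expert" or "self"
    dialogue: str     # the agent's turns, flattened to text for this sketch
    rating: float     # LLM-assigned score (a 0-10 scale is assumed here)


def filter_episodes(episodes, source, min_rating=7.0, top_k=2):
    """Keep at most top_k highest-rated episodes per scenario from one source.

    min_rating and top_k are illustrative assumptions, not values from the paper.
    """
    by_scenario = {}
    for ep in episodes:
        if ep.agent == source and ep.rating >= min_rating:
            by_scenario.setdefault(ep.scenario_id, []).append(ep)

    kept = []
    for scenario_id in sorted(by_scenario):
        ranked = sorted(by_scenario[scenario_id], key=lambda e: e.rating, reverse=True)
        kept.extend(ranked[:top_k])
    return kept


episodes = [
    Episode("s1", "expert", "...", 9.0),
    Episode("s1", "expert", "...", 4.0),
    Episode("s1", "self", "...", 8.0),
    Episode("s2", "self", "...", 6.5),
    Episode("s2", "expert", "...", 7.5),
]

# Behavior cloning would train on filtered expert episodes,
# self-reinforcement on the agent's own filtered episodes.
bc_data = filter_episodes(episodes, source="expert")
sr_data = filter_episodes(episodes, source="self")
```

In this sketch both training signals share one filter and differ only in which policy generated the episodes. The filtered sets would then be passed to an ordinary supervised fine-tuning step, which is omitted here.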