Large Language Models (LLMs) have demonstrated surprising performance on many tasks, including writing supportive messages that display empathy. Here, we had these models generate empathic messages in response to posts describing common life experiences, such as workplace situations, parenting, relationships, and other anxiety- and anger-eliciting situations. Across two studies (N=192, 202), we showed human raters a variety of responses written by several models (GPT4 Turbo, Llama2, and Mistral) and asked them to rate how empathic each response seemed. We found that LLM-generated responses were consistently rated as more empathic than human-written responses. Linguistic analyses also showed that these models write in distinct, predictable ``styles'', in terms of their use of punctuation, emojis, and certain words. These results highlight the potential of using LLMs to enhance human peer support in contexts where empathy is important.