Empathy is central to human connection, yet people often struggle to express it effectively. In blinded evaluations, large language models (LLMs) generate responses that are often judged more empathic than human-written ones. However, when a response is attributed to AI, recipients feel less heard and validated than when a comparable response is attributed to a human. To probe and address this gap in empathic communication skill, we built Lend an Ear, an experimental conversation platform in which participants are asked to offer empathic support to an LLM role-playing a person facing personal and workplace troubles. From 33,938 messages spanning 2,904 text-based conversations between 968 participants and their LLM conversational partners, we derive a data-driven taxonomy of idiomatic empathic expressions in naturalistic dialogue. Based on a pre-registered randomized experiment, we present evidence that a brief LLM coaching intervention offering personalized feedback on how to communicate empathy effectively significantly increases the alignment of participants' messages with normative empathic communication patterns, relative to both a control group and a group that received video-based but non-personalized feedback. Moreover, we find evidence for a silent empathy effect: people feel empathy but systematically fail to express it. Nonetheless, participants reliably identify responses aligned with normative empathic communication criteria as more expressive of empathy. Together, these results advance the scientific understanding of how empathy is expressed and valued, and demonstrate a scalable, AI-based intervention for scaffolding and cultivating it.