AI chatbots versus human healthcare professionals: a systematic review and meta-analysis of empathy in patient care

Background: Empathy is widely recognized for improving patient outcomes, including reduced pain and anxiety and improved satisfaction, and its absence can cause harm. Meanwhile, use of artificial intelligence (AI)-based chatbots in healthcare is rapidly expanding, with one in five general practitioners using generative AI to assist with tasks such as writing letters. Some studies suggest AI chatbots can outperform human healthcare professionals (HCPs) in empathy, though findings are mixed and lack synthesis.

Sources of data: We searched multiple databases for studies comparing AI chatbots using large language models with human HCPs on empathy measures. We assessed risk of bias with ROBINS-I and synthesized findings using random-effects meta-analysis where feasible, whilst avoiding double counting.

Areas of agreement: We identified 15 studies (2023-2024). Thirteen studies reported statistically significantly higher empathy ratings for AI, with only two studies situated in dermatology favouring human responses. Of the 15 studies, 13 provided extractable data and were suitable for pooling. Meta-analysis of those 13 studies, all utilising ChatGPT-3.5/4, showed a standardized mean difference of 0.87 (95% CI, 0.54-1.20) favouring AI (P < .00001), roughly equivalent to a two-point increase on a 10-point scale.

Areas of controversy: Studies relied on text-based assessments that overlook non-verbal cues and evaluated empathy through proxy raters.

Growing points: Our findings indicate that, in text-only scenarios, AI chatbots are frequently perceived as more empathic than human HCPs.

Areas timely for developing research: Future research should validate these findings with direct patient evaluations and assess whether emerging voice-enabled AI systems can deliver similar empathic advantages.
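The pooled figures reported under "Areas of agreement" can be sanity-checked with some back-of-envelope arithmetic. The sketch below is not the authors' analysis. It assumes the 95% CI is a symmetric Wald interval (estimate plus or minus 1.96 standard errors) and that the P value comes from a normal approximation; neither assumption is stated in the abstract. It also backs out the standard deviation that would make an SMD of 0.87 correspond to about two points on a 10-point scale.

```python
import math

# Values reported in the abstract.
smd = 0.87                   # pooled standardized mean difference
ci_low, ci_high = 0.54, 1.20 # reported 95% confidence interval

# Assumption (not stated in the abstract): symmetric Wald interval,
# i.e. estimate +/- z_crit * SE with z_crit for a two-sided 95% level.
z_crit = 1.959964
se = (ci_high - ci_low) / (2 * z_crit)

# Assumption: two-sided P value from the standard normal distribution.
z = smd / se
p_two_sided = math.erfc(z / math.sqrt(2))

# "Roughly a two-point increase on a 10-point scale" implies this SD,
# because raw difference = SMD * SD.
raw_difference = 2.0
implied_sd = raw_difference / smd

print(f"SE = {se:.3f}, z = {z:.2f}, two-sided p = {p_two_sided:.1e}")
print(f"Implied SD on the 10-point scale = {implied_sd:.2f}")
```

Under these assumptions the standard error comes to about 0.17, the P value falls well below .00001, and the implied standard deviation is about 2.3 points, so the reported numbers are mutually consistent.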

