Inform, Coach, Relate, Listen: Auditing LLM Caregiving Support Roles

Language models are increasingly deployed for conversational support in informal caregiving contexts, where interactions often extend beyond information-seeking: caregivers seek emotional reassurance, guidance, and help while navigating uncertain, relationally complex care decisions. Yet most safety evaluations assess model behavior under generic prompts, leaving a critical question unexamined: does a model's safety profile change with its support role? We study this by operationalizing four expert-reviewed support roles grounded in social support theory (Inform, Coach, Relate, and Listen) and comparing them against two baseline controls: a basic prompting condition and a retrieval-augmented generation (RAG) condition. We evaluate three language models (GPT-4o-mini, Llama-3.1-8B-Instruct, and MedGemma-1.5-4b-it) on 5,000 real-world queries from online Alzheimer's Disease and Related Dementias (ADRD) communities. We find that the LLM's support role systematically shapes both the prevalence and composition of interactional risks. Furthermore, a human evaluation study reveals a perceived quality-safety tension: more directive, information-oriented roles are rated as more helpful and trustworthy despite exhibiting elevated interactional risk profiles. We release ~90,000 support role-conditioned model responses with risk annotations as an ecologically grounded resource for research on safer LLM-mediated conversational support.

