Large language models (LLMs) have been actively applied in the mental health field. Recent research shows the promise of LLMs in delivering psychotherapy, especially motivational interviewing (MI). However, few studies have investigated how language models understand MI ethics. Given the risk that malicious actors could use language models to apply MI for unethical purposes, it is important to evaluate their capability to differentiate ethical from unethical MI practices. Thus, this study investigates the ethical awareness of LLMs in MI through multiple experiments. Our findings show that LLMs have a moderate to strong knowledge of MI. However, their ethical standards are not aligned with the MI spirit: they generated unethical responses and performed poorly at detecting unethical responses. We propose a Chain-of-Ethic prompt to mitigate these risks and improve safety. Finally, the proposed strategy effectively improved both ethical MI response generation and detection performance. These findings highlight the need for safety evaluations and guidelines for building ethical LLM-powered psychotherapy.