Mental health disorders impose a substantial global socioeconomic burden. While large language models (LLMs) can offer 24/7, non-judgmental interactions to help address this burden, pretrained models lack the contextual coherence and emotional alignment required for appropriate therapeutic dialogue. Existing methods suffer from three critical methodological gaps: 1) Supervised Fine-Tuning (SFT) produces repetitive, context-insensitive outputs that fail to balance clinical accuracy with genuine empathy; 2) Reinforcement Learning (RL)-based therapeutic systems rely on generic reward functions (e.g., BLEU, ROUGE) that prioritise lexical similarity over clinically specific emotional appropriateness and contextual relevance; 3) large LLMs are resource-intensive and pose data privacy risks, making local deployment in clinical settings infeasible. To address these gaps, this study investigates the application of SFT and RL techniques to enhance GPT-2's capacity for therapeutic dialogue generation. The methodology restructured input formats to enable simultaneous processing of contextual information and emotional states alongside user input, and employed a novel multi-component reward function that explicitly aligns model outputs with professional therapeutic logic (not merely lexical overlap) and annotated emotions. Results demonstrated substantial improvements from RL over baseline GPT-2 across multiple evaluation metrics: BLEU (0.0111), ROUGE-1 (0.1397), ROUGE-2 (0.0213), ROUGE-L (0.1317), and METEOR (0.0581). LLM-based evaluation confirmed high contextual relevance and professionalism, while the RL-trained model achieved 99.34% emotion accuracy compared to 66.96% for baseline GPT-2. These findings demonstrate RL's effectiveness in developing therapeutic dialogue systems that can serve as valuable assistive tools for therapists, while maintaining essential human clinical oversight.
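To make the idea of a multi-component reward concrete, the sketch below combines a lexical-fidelity term with an emotion-agreement term. This is a minimal illustration only: the paper's actual reward components, weights, and emotion scoring are not specified here, and all names (`therapeutic_reward`, `w_lex`, `w_emo`) and the 0.4/0.6 weighting are hypothetical assumptions.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 score, a simple stand-in for the lexical component."""
    cand, ref = Counter(candidate.lower().split()), Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def therapeutic_reward(response: str, reference: str,
                       predicted_emotion: str, annotated_emotion: str,
                       w_lex: float = 0.4, w_emo: float = 0.6) -> float:
    """Hypothetical multi-component reward: weighted sum of lexical fidelity
    to a reference reply and agreement with the annotated emotion label."""
    lexical = rouge1_f1(response, reference)
    emotion = 1.0 if predicted_emotion == annotated_emotion else 0.0
    return w_lex * lexical + w_emo * emotion
```

In an RL loop (e.g. PPO), this scalar would score each generated response; weighting emotion agreement more heavily than lexical overlap reflects the abstract's emphasis on emotional appropriateness over surface similarity.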