Large language models have demonstrated significant capabilities in generating diverse and context-aware responses for empathetic dialogue. However, their computational demands severely limit their deployment in resource-constrained environments. While knowledge distillation offers a promising compression solution, it often fails to transfer the nuanced understanding essential for empathy, as it overlooks the implicit contextual cues that guide human connection. To bridge this gap, we propose a \textbf{pr}ivileged \textbf{i}nformation-enhanced knowledge \textbf{d}istillation method for \textbf{e}mpathetic dialogue generation (PRIDE). Our method leverages privileged information, such as expert psychological annotations or future event summaries, which is available exclusively during training but unavailable at inference time. This allows us to transfer the teacher model's empathetic reasoning to smaller models without relying on extra inputs during deployment. Specifically, PRIDE has three key components: (1) An empathy-reasoning prompt that guides the teacher to explicitly decompose the empathetic process into understanding feelings and analyzing situations step-by-step; (2) A multi-source attention mechanism that directs the student to effectively integrate privileged information; (3) A dual-alignment loss that combines reversed Kullback-Leibler divergence and maximum mean discrepancy to ensure robust knowledge transfer at both logit and feature levels. Experiments on multi-modal and text-only datasets demonstrate that our method achieves competitive performance, and in some cases matches or even surpasses larger teacher models in terms of accuracy and semantic relevance.
翻译:暂无翻译