Data-driven task-oriented dialogue systems (ToDs) have advanced rapidly, yet they struggle with incremental learning because of computational constraints and long training times. Continual Learning (CL) attempts to solve this by avoiding intensive pre-training, but it suffers from catastrophic forgetting (CF). Although generative-based rehearsal CL methods have made significant progress, generating pseudo samples that accurately reflect the underlying task-specific distribution remains a challenge. In this paper, we present Dirichlet Continual Learning (DCL), a novel generative-based rehearsal strategy for CL. Instead of the Gaussian latent variable traditionally used in the Conditional Variational Autoencoder (CVAE), DCL uses the more flexible Dirichlet distribution to model the latent prior. This allows it to capture sentence-level features of previous tasks efficiently and to guide the generation of pseudo samples. In addition, we introduce Jensen-Shannon Knowledge Distillation (JSKD), a robust logit-based knowledge distillation method that enhances knowledge transfer during pseudo sample generation. Experiments on intent detection and slot filling show that our approach outperforms state-of-the-art methods.
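To make the two ingredients named above concrete, the following is a minimal sketch and not the authors' implementation. It assumes that JSKD compares temperature-softened teacher and student output distributions with the Jensen-Shannon divergence, and it draws a Dirichlet latent vector by normalizing independent Gamma samples. All function names, the `temperature` parameter, and the example values are hypothetical choices for illustration; the paper's actual loss weighting and sampling scheme may differ.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, shifting by the max for stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric in p and q, bounded by log(2)."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def js_distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Hypothetical logit-based distillation term (assumed form of JSKD)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return js_divergence(p, q)


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing Gamma(alpha_i, 1) samples."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # made-up logits from the previous model
    student = [1.5, 0.7, -0.8]   # made-up logits from the current model
    print("JS distillation loss:", js_distillation_loss(teacher, student))

    rng = random.Random(0)
    z = sample_dirichlet([0.5, 1.0, 2.0], rng)  # made-up concentration values
    print("Dirichlet latent sample:", z)
```

Unlike the KL divergence, the Jensen-Shannon divergence is symmetric and stays finite even when one distribution assigns near-zero probability to a class, which is one plausible reading of why the abstract calls the method robust. A Dirichlet sample lies on the probability simplex (non-negative entries that sum to one), in contrast to an unconstrained Gaussian latent vector.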