Continual Learning with Dirichlet Generative-based Rehearsal

Recent data-driven task-oriented dialogue systems (ToDs) struggle with incremental learning because of computational constraints and time costs. Continual Learning (CL) attempts to address this by avoiding intensive pre-training, but it suffers from catastrophic forgetting (CF). Although generative-based rehearsal CL methods have made significant progress, generating pseudo samples that accurately reflect the underlying task-specific distribution remains a challenge. In this paper, we present Dirichlet Continual Learning (DCL), a novel generative-based rehearsal strategy for CL. In place of the Gaussian latent variable traditionally used in the Conditional Variational Autoencoder (CVAE), DCL uses the more flexible Dirichlet distribution to model the latent prior variable. This allows it to capture sentence-level features of previous tasks efficiently and to guide the generation of pseudo samples effectively. In addition, we introduce Jensen-Shannon Knowledge Distillation (JSKD), a robust logit-based knowledge distillation method that enhances knowledge transfer during pseudo sample generation. Our experiments confirm the efficacy of the approach on both intent detection and slot-filling tasks, where it outperforms state-of-the-art methods.
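The two ingredients named in the abstract, a Dirichlet-distributed latent variable and a Jensen-Shannon divergence between teacher and student logits, can be illustrated with a minimal sketch. The abstract does not give the exact DCL or JSKD formulations, so the code below shows only the generic textbook definitions. The function names (`sample_dirichlet`, `js_distillation_loss`), the `temperature` parameter, and the example logits are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma draws.

    Hypothetical helper: a stand-in for sampling a latent vector on the simplex.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Jensen-Shannon divergence between teacher and student distributions.

    Assumption: this is the standard JS divergence on softmax outputs,
    not necessarily the exact loss used by JSKD in the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    mix = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)


if __name__ == "__main__":
    rng = random.Random(0)
    latent = sample_dirichlet([1.0, 1.0, 1.0, 1.0], rng)
    print("latent sample on the simplex:", latent)
    print("JS loss:", js_distillation_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.5]))
```

Unlike the KL divergence commonly used for logit-based distillation, the Jensen-Shannon divergence is symmetric in its two arguments and bounded above by ln 2, which is one plausible reading of why the abstract calls the method robust.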

