We study the problem of fitting a model to a dynamical environment when new modes of behavior emerge sequentially. The learning model is aware when a new mode appears, but it does not have access to the true modes of individual training sequences. State-of-the-art continual learning approaches cannot handle this setup, because parameter transfer suffers from catastrophic interference and episodic memory design requires knowledge of the ground-truth modes of sequences. We devise a novel continual learning method that overcomes both limitations by maintaining a descriptor of the mode of an encountered sequence in a neural episodic memory. We place a Dirichlet Process prior on the attention weights of the memory to foster efficient storage of the mode descriptors. Our method transfers knowledge across tasks by retrieving, from past tasks, the descriptors of modes similar to the mode of the current sequence and feeding the retrieved descriptor into the transition kernel of the current sequence as control input. We observe that the continual learning performance of our method compares favorably to the mainstream parameter transfer approach.
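The retrieval-and-control idea described above can be illustrated with a minimal sketch. It is not the method of this work: the function names, the linear transition, and the way the prior weights enter the attention logits are all assumptions made for illustration. A truncated stick-breaking construction stands in for the Dirichlet Process prior on the attention weights.

```python
import numpy as np


def stick_breaking(betas):
    """Truncated stick-breaking: map fractions in (0, 1) to weights summing to at most 1."""
    betas = np.asarray(betas, dtype=float)
    # Length of stick remaining before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining


def retrieve_descriptor(query, keys, values, prior_weights):
    """Soft attention read over memory slots (hypothetical design).

    The log prior weights are added to the similarity logits, so slots
    with small prior mass are rarely read.
    """
    logits = keys @ query + np.log(prior_weights)
    logits = logits - logits.max()  # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum()
    return attn @ values, attn


def transition(state, descriptor, A, B):
    """Assumed linear transition with the mode descriptor as control input."""
    return A @ state + B @ descriptor


# Usage with arbitrary illustrative sizes: 3 memory slots, 2-d keys, 4-d descriptors.
rng = np.random.default_rng(0)
prior = stick_breaking([0.5, 0.5, 0.5])
keys = rng.normal(size=(3, 2))
values = rng.normal(size=(3, 4))
descriptor, attn = retrieve_descriptor(rng.normal(size=2), keys, values, prior)
next_state = transition(np.zeros(2), descriptor, np.eye(2), rng.normal(size=(2, 4)))
```

Because the descriptor enters only as a control input, the transition parameters `A` and `B` in this sketch can stay shared across modes while the memory carries the mode-specific information.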