Continual learning has gained substantial attention within the deep learning community, offering promising solutions to the challenging problem of sequential learning. Yet a largely unexplored facet of this paradigm is its susceptibility to adversarial attacks, especially attacks that aim to induce forgetting. In this paper, we introduce "BrainWash," a novel data poisoning method tailored to impose forgetting on a continual learner. By applying the BrainWash noise across a variety of continual learning baselines, we demonstrate that a trained continual learner can be induced to catastrophically forget its previously learned tasks, even when these baselines are in use. An important feature of our approach is that the attacker requires no access to previous tasks' data and needs only the model's current parameters and the data belonging to the most recent task. Our extensive experiments highlight the efficacy of BrainWash, showing performance degradation across various regularization-based continual learning methods.