Continual learning has gained substantial attention within the deep learning community, offering promising solutions to the challenging problem of sequential learning. Yet a largely unexplored facet of this paradigm is its susceptibility to adversarial attacks, especially those that aim to induce forgetting. In this paper, we introduce "BrainWash," a novel data poisoning method tailored to impose forgetting on a continual learner. By applying BrainWash noise to a variety of continual learning baselines, we demonstrate how a trained continual learner can be induced to catastrophically forget its previously learned tasks, even when these baselines are in use. An important feature of our approach is that the attacker requires no access to previous tasks' data and is armed merely with the model's current parameters and the data belonging to the most recent task. Our extensive experiments highlight the efficacy of BrainWash, showing performance degradation across various regularization-based continual learning methods.