Continual learning has gained substantial attention within the deep learning community, offering promising solutions to the challenging problem of sequential learning. Yet a largely unexplored facet of this paradigm is its susceptibility to adversarial attacks, especially those that aim to induce forgetting. In this paper, we introduce "BrainWash," a novel data poisoning method tailored to impose forgetting on a continual learner. By applying BrainWash noise against a variety of baselines, we demonstrate that a trained continual learner can be induced to catastrophically forget its previously learned tasks, even when these continual learning baselines are in use. An important feature of our approach is that the attacker requires no access to previous tasks' data and is armed merely with the model's current parameters and the data belonging to the most recent task. Our extensive experiments highlight the efficacy of BrainWash, showcasing degradation in performance across various regularization-based continual learning methods.