Foundation models (FMs) have become the hallmark of modern AI, but they are trained on massive data, which makes training financially expensive. Updating FMs as new data becomes available is important, but it can lead to "catastrophic forgetting", where models underperform on tasks related to data sub-populations observed too long ago. This continual learning (CL) phenomenon has been extensively studied, but primarily in a setting where only a small amount of past data can be stored. We advocate for a paradigm in which memory is abundant, allowing us to keep all previous data, but computational resources are limited. In this setting, traditional replay-based CL approaches are outperformed by a simple baseline that replays past data selected uniformly at random, indicating that this setting necessitates a new approach. We address this by introducing a framework of adaptive memory replay for continual learning, in which the sampling of past data is phrased as a multi-armed bandit problem. We use Boltzmann sampling to derive a method that dynamically selects past data for training conditioned on the current task, assuming full data access and emphasizing training efficiency. Through extensive evaluations on both vision and language pre-training tasks, we demonstrate the effectiveness of our approach, which maintains high performance while reducing forgetting by up to 10% at no training efficiency cost.
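To make the bandit framing concrete, the following is a minimal sketch of Boltzmann (softmax) sampling over groups of past data. It is not the method derived in the paper: the class `ReplayBandit`, the treatment of each group of past data as one arm, the use of observed replay loss as the reward signal, and the parameters `temperature` and `step_size` are all assumptions made for illustration.

```python
import numpy as np


def boltzmann_probs(rewards, temperature=1.0):
    """Turn per-arm reward estimates into Boltzmann (softmax) probabilities."""
    scaled = np.asarray(rewards, dtype=float) / temperature
    scaled -= scaled.max()  # subtract the max for numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum()


class ReplayBandit:
    """Hypothetical bandit: each arm is one group (cluster) of stored past data."""

    def __init__(self, num_arms, temperature=1.0, step_size=0.5, seed=0):
        self.rewards = np.zeros(num_arms)  # running reward estimate per arm
        self.temperature = temperature
        self.step_size = step_size
        self.rng = np.random.default_rng(seed)

    def sample_arms(self, batch_size):
        """Pick which groups to replay from, favouring high-reward arms."""
        probs = boltzmann_probs(self.rewards, self.temperature)
        return self.rng.choice(len(self.rewards), size=batch_size, p=probs)

    def update(self, arm, observed_loss):
        """Assumed reward: the loss seen on replayed data (a proxy for forgetting)."""
        self.rewards[arm] += self.step_size * (observed_loss - self.rewards[arm])


if __name__ == "__main__":
    bandit = ReplayBandit(num_arms=4, temperature=0.5)
    hypothetical_losses = [0.1, 0.1, 2.0, 0.1]  # made-up values: arm 2 is "forgotten"
    for _ in range(50):
        for arm in bandit.sample_arms(batch_size=8):
            bandit.update(arm, hypothetical_losses[arm])
    print(boltzmann_probs(bandit.rewards, bandit.temperature))
```

While all reward estimates are equal, this sampler reduces to uniform random replay, the baseline mentioned above. Lowering `temperature` concentrates replay on the arms with the highest estimated loss, and raising it moves the sampler back toward uniform.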