Amnesia: A Stealthy Replay Attack on Continual Learning Dreams

Continual learning (CL) models often use experience replay to reduce catastrophic forgetting, but their robustness to interference with replay sampling remains underexplored. Existing CL attacks alter inputs or training pipelines (poisoning or backdoors) and rarely include explicit auditable constraints, which limits their realism. Here, auditability means that a monitor can verify compliance from sampler-visible telemetry (e.g., logged replay index and label statistics) by checking that the realized replay class histogram stays close to a nominal baseline and that the replay rate is unchanged per batch and/or over a rolling window. We study a limited-privilege insider who controls only replay index selection, not pixels, labels, or model parameters, and who stays within auditable limits such as queue priorities. We introduce Amnesia, a replay composition attack that maximizes degradation under two budgets: a visibility budget delta that bounds the total variation (TV) distance or KL divergence from a nominal class histogram p0, and a mass budget f that fixes the replay rate. Amnesia has two steps: (i) compute lightweight class utilities, such as EMA loss or confidence, to tilt p0 toward harmful classes; and (ii) project the tilted histogram back into the delta-ball using efficient KL (exponential tilt) or TV (balanced mass redistribution) optimizers. A windowed scheduler keeps the attack compliant under rolling-window audits. Across challenging CL benchmarks and strong replay baselines, Amnesia consistently lowers final accuracy (ACC) and worsens backward transfer (-BWT). The KL variant delivers high impact while remaining largely undetected under multiple audit schemes, including per-batch and rolling-window checks. The TV variant is more damaging but easier to detect, especially under tight per-class constraints. These results expose index-only replay control as a practical threat surface in CL systems that persists under auditing, and they establish a principled impact-visibility trade-off.
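
The two projection steps can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (kl_tilt, tv_shift), the bisection search over the tilt strength, and the greedy mass transfer are assumptions chosen to match the abstract's description of an exponential tilt under a KL budget and a mass redistribution under a TV budget. It assumes p0 is strictly positive and that higher utility u means a class is more harmful to replay.

```python
import numpy as np

def kl_div(q, p):
    """KL(q || p) for discrete distributions; terms with q == 0 contribute 0."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def tv_dist(q, p):
    """Total variation distance: half the L1 distance."""
    return 0.5 * float(np.sum(np.abs(q - p)))

def kl_tilt(p0, u, delta, iters=60):
    """Hypothetical helper: exponential tilt q ~ p0 * exp(lam * u), with lam
    found by bisection so that KL(q || p0) <= delta (assumes p0 > 0)."""
    p0 = np.asarray(p0, dtype=float)
    u = np.asarray(u, dtype=float)

    def tilt(lam):
        w = p0 * np.exp(lam * (u - u.max()))  # shift by max for stability
        return w / w.sum()

    lo, hi = 0.0, 1.0
    # Grow the bracket until the budget is exceeded (or a cap is reached).
    while kl_div(tilt(hi), p0) < delta and hi < 1e6:
        hi *= 2.0
    # KL(tilt(lam) || p0) is non-decreasing in lam >= 0, so bisection applies.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kl_div(tilt(mid), p0) <= delta:
            lo = mid
        else:
            hi = mid
    return tilt(lo)

def tv_shift(p0, u, delta):
    """Hypothetical helper: move up to delta probability mass from the
    lowest-utility classes to the highest-utility class, so TV(q, p0) <= delta."""
    q = np.asarray(p0, dtype=float).copy()
    u = np.asarray(u, dtype=float)
    top = int(np.argmax(u))
    budget = min(delta, 1.0 - q[top])
    for c in np.argsort(u):  # lowest utility first
        if c == top or budget <= 0:
            continue
        take = min(q[c], budget)
        q[c] -= take
        q[top] += take
        budget -= take
    return q

# Illustrative values only (not taken from the paper).
p0 = np.full(4, 0.25)                # nominal replay class histogram
u = np.array([0.1, 0.2, 0.9, 0.4])   # e.g., EMA loss per class
delta = 0.05                         # visibility budget

q_kl = kl_tilt(p0, u, delta)
q_tv = tv_shift(p0, u, delta)
```

Both outputs remain valid histograms inside their respective delta-balls, so a monitor that checks only the divergence from p0 would accept them. The replay rate f is untouched because only the class composition of the replayed samples changes.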
