Class-Incremental Semantic Segmentation (CISS) extends the traditional segmentation task by incrementally learning newly added classes. Previous work has introduced generative replay, which replays old-class samples generated by a pre-trained GAN, to address catastrophic forgetting and privacy concerns. However, the generated images lack semantic precision and exhibit out-of-distribution characteristics, resulting in inaccurate masks that further degrade segmentation performance. To tackle these challenges, we propose DiffusePast, a novel framework featuring a diffusion-based generative replay module that generates semantically accurate images with more reliable masks, guided by different instructions (e.g., text prompts or edge maps). Specifically, DiffusePast introduces a dual-generator paradigm that generates old-class images aligned with the distribution of the downstream datasets while preserving the structure and layout of the original images, enabling more precise masks. To continually adapt to the novel visual concepts of newly added classes, we incorporate class-wise token embeddings when updating the dual generator. Moreover, we assign appropriate old-class pseudo-labels to the background pixels of the new-step images, further mitigating the forgetting of previously learned knowledge. In comprehensive experiments, our method demonstrates competitive performance across mainstream benchmarks and strikes a better balance between the performance on old and novel classes.
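The background pseudo-labeling step mentioned above can be illustrated with a minimal sketch. This is not the DiffusePast implementation: the function name `pseudo_label_background`, the confidence `threshold`, the use of label id 0 for background, and the convention that channel indices of the old model's output equal old-class ids are all assumptions made for illustration.

```python
import numpy as np

BACKGROUND = 0  # assumed label id for the background class


def pseudo_label_background(new_labels, old_probs, threshold=0.7):
    """Fill background pixels with confident old-class predictions.

    new_labels: (H, W) int array, ground truth of the current step, in which
                old-class pixels are annotated as background.
    old_probs:  (C_old, H, W) float array, per-pixel class probabilities from
                the previous-step model (channel 0 assumed to be background).
    threshold:  hypothetical confidence cutoff, not taken from the paper.
    """
    old_pred = old_probs.argmax(axis=0)  # most likely old class per pixel
    old_conf = old_probs.max(axis=0)     # probability of that class

    # Only touch pixels that are background in the new annotation, that the
    # old model assigns to a non-background class, and that are confident.
    replace = (
        (new_labels == BACKGROUND)
        & (old_pred != BACKGROUND)
        & (old_conf >= threshold)
    )

    out = new_labels.copy()
    out[replace] = old_pred[replace]
    return out


# Toy 2x2 image: classes 1 and 2 are old, class 3 is newly added.
new_labels = np.array([[0, 0],
                       [3, 0]])
old_probs = np.array([
    [[0.10, 0.40], [0.05, 0.20]],  # background
    [[0.80, 0.35], [0.05, 0.30]],  # old class 1
    [[0.10, 0.25], [0.90, 0.50]],  # old class 2
])
merged = pseudo_label_background(new_labels, old_probs)
```

In this toy case only the top-left pixel is relabeled (to old class 1); the pixel annotated with the new class 3 keeps its ground-truth label even though the old model is confident there, and the low-confidence prediction at the bottom-right stays background.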