Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources. Task-Incremental Learning (TIL) offers a privacy-preserving training paradigm in which tasks arrive sequentially, avoiding the centralized data aggregation that strict data-sharing policies prohibit. However, task evolution can span a wide scope, involving intricately correlated shifts in both image appearance and segmentation semantics, which causes concurrent appearance and semantic forgetting. To address this issue, we propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs that mimic past task data, focusing on two aspects: modeling image-mask correspondence and promoting scalability across diverse tasks. Specifically, we introduce a novel Bayesian Joint Diffusion (BJD) model for high-quality synthesis of image-mask pairs whose correspondence is explicitly preserved by conditional denoising. Furthermore, we develop a Task-Oriented Adapter (TOA) that recalibrates prompt embeddings to modulate the diffusion model, making data synthesis compatible with different tasks. Experiments on incremental tasks (cardiac, fundus, and prostate segmentation) show its clear advantage in alleviating concurrent appearance and semantic forgetting. Code is available at https://github.com/jingyzhang/CGR.