Despite rapid advances in continual segmentation, scaling to large, continually expanding vocabularies under compute constraints remains an open problem. We find that traditional continual training suffers from catastrophic forgetting under compute constraints and fails to outperform zero-shot segmentation methods. We introduce a novel strategy for semantic and panoptic segmentation with zero forgetting that adapts to continually growing vocabularies without retraining or large memory costs. Our training-free approach, kNN-CLIP, leverages a database of instance embeddings to let open-vocabulary segmentation methods continually expand their vocabulary on any given domain with a single pass through the data; because it stores only embeddings, it minimizes both compute and memory costs. This method achieves state-of-the-art mIoU across large-vocabulary semantic and panoptic segmentation datasets. We hope kNN-CLIP represents a step toward more efficient and adaptable continual segmentation, paving the way for advances in real-world large-vocabulary continual segmentation methods.
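The retrieval idea behind a database of instance embeddings can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `EmbeddingDatabase`, its methods, the cosine-similarity metric, the majority vote over `k` neighbours, and the toy 3-dimensional vectors are all assumptions made for illustration. In a real system the vectors would come from an image encoder such as CLIP. The sketch only shows why appending embeddings cannot overwrite earlier knowledge: new classes are added by storing more vectors, and no weights are updated.

```python
from collections import Counter

import numpy as np


class EmbeddingDatabase:
    """Toy store of L2-normalized instance embeddings with class labels (hypothetical)."""

    def __init__(self, dim):
        self.dim = dim
        self.embeddings = np.empty((0, dim), dtype=np.float32)
        self.labels = []

    def add(self, embeddings, labels):
        """Append new instance embeddings; existing entries are never modified."""
        embeddings = np.asarray(embeddings, dtype=np.float32).reshape(-1, self.dim)
        if len(embeddings) != len(labels):
            raise ValueError("need exactly one label per embedding")
        norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
        embeddings = embeddings / np.clip(norms, 1e-12, None)
        self.embeddings = np.vstack([self.embeddings, embeddings])
        self.labels.extend(labels)

    def query(self, embedding, k=3):
        """Return the majority label among the k most cosine-similar stored embeddings."""
        if not self.labels:
            raise ValueError("database is empty")
        q = np.asarray(embedding, dtype=np.float32)
        q = q / max(float(np.linalg.norm(q)), 1e-12)
        sims = self.embeddings @ q  # cosine similarity, since rows are unit-norm
        k = min(k, len(self.labels))
        top = np.argsort(-sims)[:k]  # indices of the k nearest neighbours
        votes = Counter(self.labels[i] for i in top)
        return votes.most_common(1)[0][0]


# Toy data: made-up 3-d vectors standing in for encoder outputs.
db = EmbeddingDatabase(dim=3)
db.add([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]], ["cat", "cat"])
db.add([[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]], ["dog", "dog"])

# Later, the vocabulary grows: a new class is added in a single pass,
# without retraining and without touching the stored "cat" or "dog" entries.
db.add([[0.0, 0.0, 1.0], [0.1, 0.0, 0.9]], ["zebra", "zebra"])

old_class = db.query([0.95, 0.05, 0.0], k=3)
new_class = db.query([0.0, 0.0, 1.0], k=3)
print(old_class, new_class)
```

In this toy setting, a query near the "cat" vectors is still resolved by its stored neighbours after "zebra" is added, which is the sense in which an append-only embedding store avoids forgetting. Memory grows with the number of stored embeddings, so a practical system would likely pair this with an approximate nearest-neighbour index.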