kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies

Rapid advancements in continual segmentation have yet to bridge the gap of scaling to large, continually expanding vocabularies under compute-constrained scenarios. We discover that traditional continual training leads to catastrophic forgetting under compute constraints and fails to outperform zero-shot segmentation methods. We introduce a novel strategy for semantic and panoptic segmentation with zero forgetting, capable of adapting to continually growing vocabularies without the need for retraining or large memory costs. Our training-free approach, kNN-CLIP, leverages a database of instance embeddings to enable open-vocabulary segmentation approaches to continually expand their vocabulary on any given domain with a single pass through the data, while storing only embeddings, which minimizes both compute and memory costs. This method achieves state-of-the-art mIoU performance across large-vocabulary semantic and panoptic segmentation datasets. We hope kNN-CLIP represents a step forward in enabling more efficient and adaptable continual segmentation, paving the way for advances in real-world large-vocabulary continual segmentation methods.
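The retrieval idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `EmbeddingDatabase`, its `add` and `query` methods, the cosine-similarity metric, the majority vote over neighbors, and the two-dimensional vectors are all assumptions chosen for illustration. In practice the embeddings would come from a vision encoder applied to segmented instances. The sketch only shows why an append-only embedding store can grow its vocabulary in a single pass without retraining.

```python
import math
from collections import Counter


def _normalize(vec):
    """Return vec scaled to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


class EmbeddingDatabase:
    """Hypothetical append-only store of (embedding, label) pairs."""

    def __init__(self):
        self.embeddings = []
        self.labels = []

    def add(self, embedding, label):
        # Single pass: each instance embedding is stored once; nothing is retrained.
        self.embeddings.append(_normalize(embedding))
        self.labels.append(label)

    def query(self, embedding, k=3):
        """Return the majority label among the k most cosine-similar entries."""
        if not self.embeddings:
            return None
        q = _normalize(embedding)
        # Dot product of unit vectors equals cosine similarity.
        sims = [sum(a * b for a, b in zip(q, e)) for e in self.embeddings]
        top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
        votes = Counter(self.labels[i] for i in top)
        return votes.most_common(1)[0][0]


# Toy usage: the vocabulary grows by appending, and old classes stay retrievable.
db = EmbeddingDatabase()
db.add([1.0, 0.0], "cat")
db.add([0.9, 0.1], "cat")
db.add([0.0, 1.0], "dog")
before = db.query([0.7, 0.7], k=1)   # nearest stored entry before "fox" exists
db.add([0.7, 0.7], "fox")            # new class added without any training step
after = db.query([0.7, 0.7], k=1)    # the same query now retrieves the new class
```

Because earlier entries are never modified, adding a class cannot degrade retrieval of the classes already stored, which is the intuition behind the "zero forgetting" claim in this simplified setting.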
