kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies

Continual segmentation has not yet tackled the challenge of improving open-vocabulary segmentation models with training data for accurate segmentation across large, continually expanding vocabularies. We discover that traditional continual training results in severe catastrophic forgetting, failing to outperform a zero-shot segmentation baseline. We introduce a novel training-free strategy, kNN-CLIP, which augments the model with a database of instance embeddings for semantic and panoptic segmentation that achieves zero forgetting. We demonstrate that kNN-CLIP can adapt to continually growing vocabularies without the need for retraining or large memory costs. kNN-CLIP enables open-vocabulary segmentation methods to expand their vocabularies on any domain with a single pass through the data, while only storing compact embeddings. This approach minimizes both compute and memory costs. kNN-CLIP achieves state-of-the-art performance across large-vocabulary semantic and panoptic segmentation datasets. We hope kNN-CLIP represents a significant step forward in enabling more efficient and adaptable continual segmentation, paving the way for advances in real-world large-vocabulary continual segmentation methods.
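The retrieval idea described above can be illustrated with a small sketch. This is not the authors' implementation: the class name `EmbeddingDatabase`, the similarity-weighted vote, and the toy 2-D vectors are assumptions made for illustration, and a real system would store embeddings produced by a vision encoder for each segmented instance. The sketch only shows why an append-only database needs no retraining: adding a class is a single insertion, and stored entries for earlier classes are never modified.

```python
import math
from collections import Counter


def _normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class EmbeddingDatabase:
    """Hypothetical append-only store of (embedding, label) pairs queried by kNN."""

    def __init__(self):
        self._embeddings = []
        self._labels = []

    def add(self, embedding, label):
        # One pass over the data: each instance embedding is stored once.
        self._embeddings.append(_normalize(embedding))
        self._labels.append(label)

    def predict(self, query, k=3):
        """Return the label with the highest summed cosine similarity among the k nearest entries."""
        if not self._embeddings:
            raise ValueError("database is empty")
        q = _normalize(query)
        sims = [sum(a * b for a, b in zip(q, e)) for e in self._embeddings]
        nearest = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
        votes = Counter()
        for i in nearest:
            votes[self._labels[i]] += sims[i]
        return votes.most_common(1)[0][0]


# Toy usage with made-up 2-D embeddings (assumed values, not real features).
db = EmbeddingDatabase()
db.add([1.0, 0.0], "cat")
db.add([0.9, 0.1], "cat")
db.add([0.0, 1.0], "dog")
print(db.predict([0.8, 0.2], k=1))

# Expanding the vocabulary is a single insertion; existing entries are untouched.
db.add([-1.0, 0.0], "zebra")
print(db.predict([-0.9, 0.1], k=1))
```

Because nothing is overwritten, a query that matched an old class before the vocabulary grew still retrieves the same stored neighbours afterwards, which is the sense in which a retrieval database avoids forgetting. The memory cost grows with the number of stored embeddings rather than with the number of training images.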

