Open-Vocabulary Segmentation (OVS) aims to segment classes that are not present in the training dataset. However, most existing studies assume that the training data is fixed in advance, overlooking the more practical scenario where new datasets are continuously collected over time. Motivated by this, we first analyze how existing OVS models perform under such conditions. In this context, we explore several approaches, such as retraining, fine-tuning, and continual learning, but find that each has clear limitations. To address these issues, we propose ConOVS, a novel continual learning method based on a Mixture-of-Experts framework. ConOVS dynamically combines expert decoders according to the probability that an input sample belongs to the distribution of each incremental dataset. Through extensive experiments, we show that ConOVS consistently outperforms existing methods across pre-training, incremental, and zero-shot test datasets, effectively expanding the recognition capabilities of OVS models when data is collected sequentially.
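The core mechanism described above, combining expert decoders with weights derived from the probability that an input belongs to each incremental dataset's distribution, can be sketched as follows. This is a minimal illustrative toy, not the paper's actual implementation: the function names, the use of a 1-D Gaussian as the per-dataset distribution model, and scalar "decoder" outputs are all assumptions introduced here for clarity.

```python
import math

def gaussian_log_density(x, mean, var):
    """Log-density of a 1-D Gaussian, standing in for a per-dataset
    distribution model over input features (an assumption of this sketch)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def mixture_weights(feature, dataset_stats):
    """Softmax over per-dataset log-likelihoods -> one weight per expert."""
    logps = [gaussian_log_density(feature, m, v) for m, v in dataset_stats]
    mx = max(logps)  # subtract max for numerical stability
    exps = [math.exp(lp - mx) for lp in logps]
    total = sum(exps)
    return [e / total for e in exps]

def combine_experts(feature, experts, dataset_stats):
    """Mixture-of-Experts combination: weighted sum of expert decoder
    outputs (here reduced to scalar logits for illustration)."""
    weights = mixture_weights(feature, dataset_stats)
    outputs = [expert(feature) for expert in experts]
    return sum(w * o for w, o in zip(weights, outputs))

# Toy usage: two "expert decoders" and two dataset distributions.
# A feature near mean 0.0 is attributed almost entirely to the first
# dataset, so the first expert dominates the combined output.
experts = [lambda x: x + 1.0, lambda x: x - 1.0]
stats = [(0.0, 1.0), (5.0, 1.0)]  # hypothetical (mean, var) per dataset
print(combine_experts(0.0, experts, stats))
```

In this toy, an input drawn from the first dataset's region of feature space receives a weight near 1 for the first expert, mirroring how ConOVS routes samples toward the expert trained on the matching incremental dataset while still blending experts softly rather than making a hard selection.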