Domain Generalized Semantic Segmentation (DGSS) deals with training a model on a labeled source domain with the aim of generalizing to unseen domains during inference. Existing DGSS methods typically obtain robust features by means of Domain Randomization (DR). Such an approach is limited because it can only diversify style, not content. In this work, we take an orthogonal approach to DGSS and propose to use an assembly of CoLlaborative FOUndation models for Domain Generalized Semantic Segmentation (CLOUDS). In detail, CLOUDS is a framework that integrates foundation models (FMs) of various kinds: (i) a CLIP backbone for its robust feature representation, (ii) generative models to diversify the content, thereby covering various modes of the possible target distribution, and (iii) the Segment Anything Model (SAM) for iteratively refining the predictions of the segmentation model. Extensive experiments show that CLOUDS excels in adapting from synthetic to real DGSS benchmarks and under varying weather conditions, outperforming prior methods by 5.6% and 6.7% in average mIoU, respectively. The code is available at https://github.com/yasserben/CLOUDS
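The results above are reported in mean Intersection over Union (mIoU). As a reminder of what that metric measures, here is a minimal sketch in plain Python. It is not taken from the CLOUDS repository: the function name `mean_iou`, the flat label lists, and the ignore label `255` (a common convention in driving-scene datasets) are all assumptions made for illustration, and the paper's evaluation code may differ in details such as how absent classes are handled.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over classes that occur in the prediction or the ground truth.

    `pred` and `target` are flattened per-pixel class ids. Pixels whose
    ground-truth label equals `ignore_index` (assumed value) are skipped.
    Predictions are assumed to lie in [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")
```

An "average mIoU" over several target datasets would then be the arithmetic mean of the per-dataset values returned by such a function.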