Domain adaptive semantic segmentation methods commonly use stage-wise training, consisting of a warm-up stage and a self-training stage. However, this popular approach still faces challenges in each stage: in warm-up, the widely adopted adversarial training often yields limited performance gain because it aligns features blindly; in self-training, finding proper categorical thresholds is difficult. To alleviate these issues, we first propose replacing the adversarial training in the warm-up stage with a novel symmetric knowledge distillation module that accesses only source domain data and makes the model domain generalizable. Surprisingly, this domain generalizable warm-up model brings substantial performance improvement, which can be further amplified by our proposed cross-domain mixture data augmentation technique. Then, for the self-training stage, we propose a threshold-free dynamic pseudo-label selection mechanism that eases the aforementioned threshold problem and better adapts the model to the target domain. Extensive experiments demonstrate that our framework achieves remarkable and consistent improvements over prior art on popular benchmarks. Code and models are available at https://github.com/fy-vision/DiGA