The lack of out-of-domain generalization is a critical weakness of deep networks for semantic segmentation. Previous studies relied on the assumption of a static model, i.e., once the training process is complete, model parameters remain fixed at test time. In this work, we challenge this premise with a self-adaptive approach for semantic segmentation that adjusts the inference process to each input sample. Self-adaptation operates on two levels. First, it fine-tunes the parameters of convolutional layers to the input image using consistency regularization. Second, in Batch Normalization layers, self-adaptation interpolates between the training distribution and the reference distribution derived from a single test sample. Although both techniques are well known in the literature, their combination sets a new state of the art in accuracy on synthetic-to-real generalization benchmarks. Our empirical study suggests that self-adaptation may complement the established practice of model regularization at training time for improving deep network generalization to out-of-domain data. Our code and pre-trained models are available at https://github.com/visinf/self-adaptive.
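
To make the second level concrete, the following is a minimal sketch of interpolating Batch Normalization statistics between the training distribution and a single test sample. It is not taken from the released code: the linear (convex) form of the interpolation, the mixing weight `alpha`, and all function names are assumptions made for illustration, and the learned affine scale and shift of the layer are omitted.

```python
from statistics import fmean, pvariance


def sample_stats(feature_map):
    """Per-channel mean and biased variance of a single sample.

    feature_map: list of channels, each a flat list of activations.
    """
    means = [fmean(ch) for ch in feature_map]
    variances = [pvariance(ch, mu) for ch, mu in zip(feature_map, means)]
    return means, variances


def interpolate_stats(train_stats, test_stats, alpha):
    """Convex combination of training and test-sample statistics.

    alpha is a hypothetical mixing weight: alpha = 0 keeps the training
    statistics, alpha = 1 uses only the test sample.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * t + alpha * s for t, s in zip(train_stats, test_stats)]


def self_adaptive_norm(feature_map, train_mean, train_var, alpha, eps=1e-5):
    """Normalize one sample with the interpolated statistics (no affine part)."""
    s_mean, s_var = sample_stats(feature_map)
    mean = interpolate_stats(train_mean, s_mean, alpha)
    var = interpolate_stats(train_var, s_var, alpha)
    return [
        [(x - m) / (v + eps) ** 0.5 for x in ch]
        for ch, m, v in zip(feature_map, mean, var)
    ]
```

With `alpha = 0` the layer behaves like standard inference-time Batch Normalization, and with `alpha = 1` it normalizes with the statistics of the test sample alone. Intermediate values trade off between the two.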