This work addresses a challenging domain adaptation setting in which knowledge from the labelled source domain dataset is available only through a pretrained black-box segmentation model. The pretrained model's predictions for target domain images are noisy because of the distributional differences between the source and target domain data. Since these predictions serve as pseudo labels during self-training, the noise in them imposes an upper bound on model performance. To address this problem, we propose ReGEN, a simple yet novel image translation workflow comprising an image-to-image translation network and a segmentation network. Our workflow generates target-like images from the noisy predictions for the original target domain images. These target-like images are semantically consistent with the noisy model predictions and can therefore be used to train the segmentation network. The generated target-like images are also stylistically similar to the target domain images. This allows us to leverage the stylistic differences between the target-like images and the target domain images as an additional source of supervision while training the segmentation model. We evaluate our model on two benchmark domain adaptation settings and demonstrate that our approach performs favourably relative to recent state-of-the-art work. The source code will be made available.
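The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the black-box output is simulated with random probabilities, and `toy_generator`, `palette`, and `pseudo_labels_from_black_box` are hypothetical stand-ins introduced here only to show how noisy predictions become pseudo labels and how a generated image paired with those labels forms a training example that is consistent by construction.

```python
import numpy as np


def pseudo_labels_from_black_box(probs):
    """Convert per-pixel class probabilities (H, W, C) into hard labels (H, W)."""
    return probs.argmax(axis=-1)


def toy_generator(labels, palette, rng, noise=0.05):
    """Hypothetical stand-in for the image-to-image translation network.

    Paints every pixel with the colour assigned to its class and adds a
    little noise, so the output image agrees with `labels` by construction.
    """
    image = palette[labels]  # (H, W, 3), one RGB colour per class
    noisy = image + rng.normal(0.0, noise, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


rng = np.random.default_rng(0)
H, W, C = 4, 4, 3  # tiny image, three classes (assumed sizes)

# Simulated black-box output for one target domain image (softmax of noise).
logits = rng.normal(size=(H, W, C))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

# Step 1: noisy predictions become pseudo labels.
labels = pseudo_labels_from_black_box(probs)

# Step 2: generate a target-like image from the pseudo labels.
palette = rng.uniform(size=(C, 3))  # assumed class-to-colour mapping
generated = toy_generator(labels, palette, rng)

# Step 3: the (generated image, pseudo label) pair is what a segmentation
# network would be trained on in this sketch.
training_pair = (generated, labels)
```

In this sketch the label noise no longer matters for the generated pair, because the image is produced from the labels rather than the other way round. That is the property the abstract relies on when it calls the target-like images semantically consistent with the predictions.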