In unsupervised domain adaptation (UDA), a model trained on source data (e.g. synthetic) is adapted to target data (e.g. real-world) without access to target annotations. Most previous UDA methods struggle with classes that have a similar visual appearance in the target domain, because no ground truth is available to learn the slight appearance differences. To address this problem, we propose a Masked Image Consistency (MIC) module that enhances UDA by learning spatial context relations of the target domain as additional clues for robust visual recognition. MIC enforces consistency between the predictions for masked target images, in which random patches are withheld, and pseudo-labels that an exponential moving average (EMA) teacher generates from the complete image. To minimize the consistency loss, the network has to learn to infer the predictions of the masked regions from their context. Due to its simple and universal concept, MIC can be integrated into various UDA methods across different visual recognition tasks such as image classification, semantic segmentation, and object detection. MIC significantly improves the state-of-the-art performance across these recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA. For instance, MIC achieves an unprecedented UDA performance of 75.9 mIoU on GTA-to-Cityscapes and 92.8% accuracy on VisDA-2017, which corresponds to improvements of +2.1 and +3.0 percentage points, respectively, over the previous state of the art. The implementation is available at https://github.com/lhoyer/MIC.
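The masked-consistency idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that): it uses plain Python lists in place of tensors, and the function names as well as the parameters `patch_size`, `mask_ratio`, and `alpha` are hypothetical choices made for illustration only. It shows three ingredients under these assumptions: a random patch mask, an EMA update of teacher weights, and a cross-entropy consistency loss between student predictions on the masked image and teacher pseudo-labels from the complete image.

```python
import math
import random


def patch_mask(height, width, patch_size, mask_ratio, rng):
    """Build a height x width grid of 0/1 values.

    Each patch_size x patch_size patch is withheld (0) with probability
    mask_ratio and kept (1) otherwise. All names here are illustrative.
    """
    rows = -(-height // patch_size)  # ceiling division
    cols = -(-width // patch_size)
    keep = [[0 if rng.random() < mask_ratio else 1 for _ in range(cols)]
            for _ in range(rows)]
    return [[keep[y // patch_size][x // patch_size] for x in range(width)]
            for y in range(height)]


def apply_mask(image, mask):
    """Zero out the withheld pixels of a single-channel image."""
    return [[pixel * m for pixel, m in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]


def ema_update(teacher, student, alpha):
    """Move each teacher weight toward the student weight (EMA step)."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def masked_consistency_loss(student_probs, pseudo_labels):
    """Mean cross-entropy between student class probabilities (predicted
    from the masked image) and teacher pseudo-labels (predicted from the
    complete image). One probability list and one label per pixel."""
    total = 0.0
    for probs, label in zip(student_probs, pseudo_labels):
        total -= math.log(probs[label])
    return total / len(pseudo_labels)


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[1.0] * 8 for _ in range(8)]
    mask = patch_mask(8, 8, patch_size=4, mask_ratio=0.5, rng=rng)
    masked_image = apply_mask(image, mask)
    print(masked_image)
    print(ema_update([1.0, 1.0], [0.0, 0.0], alpha=0.9))
    print(masked_consistency_loss([[0.5, 0.5], [0.25, 0.75]], [0, 1]))
```

In an actual training loop, the student would predict on `masked_image`, the teacher on the unmasked `image`, and the loss would be back-propagated through the student only, with the teacher refreshed by `ema_update` after each step.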