Semantic image segmentation is a critical component of many computer vision systems, such as those used in autonomous driving. In such applications, adverse conditions (heavy rain, nighttime, snow, extreme lighting) pose specific challenges, yet they are typically underrepresented in the available datasets. Generating more training data is cumbersome and expensive, and the process itself is error-prone due to the inherent aleatoric uncertainty. To address this problem, we propose BTSeg, which exploits image-level correspondences as a weak supervision signal to learn a segmentation model that is agnostic to adverse conditions. To this end, our approach uses the Barlow Twins loss from the field of unsupervised learning and treats images taken at the same location but under different adverse conditions as "augmentations" of the same unknown underlying base image. This allows us to train a segmentation model that is robust to the appearance changes introduced by different adverse conditions. We evaluate our approach on ACDC and the new, challenging ACG benchmark to demonstrate its robustness and generalization capabilities. Our approach performs favorably compared to current state-of-the-art methods, while also being simpler to implement and train. The code will be released upon acceptance.
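To make the role of the Barlow Twins loss concrete, the following is a minimal NumPy sketch of the standard formulation of that loss, applied to two batches of embeddings. It is an illustration under stated assumptions, not the BTSeg implementation: the abstract does not specify which features are compared or how they are pooled, so the function name `barlow_twins_loss`, the weight `lambda_offdiag`, the embedding shapes, and the synthetic "clear" and "rainy" embeddings are all hypothetical.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-6):
    """Standard Barlow Twins loss for two embedding batches of shape (N, D).

    Assumption: z_a[i] and z_b[i] are embeddings of two images of the same
    location under different conditions. How BTSeg actually produces these
    embeddings is not stated in the abstract.
    """
    n = z_a.shape[0]

    # Standardize each feature dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)

    # Cross-correlation matrix between the two views, shape (D, D).
    c = z_a.T @ z_b / n

    # Invariance term: diagonal entries should be close to 1.
    diag = np.diagonal(c)
    on_diag = np.sum((diag - 1.0) ** 2)

    # Redundancy-reduction term: off-diagonal entries should be close to 0.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)

    return float(on_diag + lambda_offdiag * off_diag)


# Hypothetical data: "rainy" is a perturbed copy of "clear" (same location),
# while "unrelated" shares no correspondence with "clear".
rng = np.random.default_rng(0)
clear = rng.normal(size=(256, 8))
rainy = clear + 0.1 * rng.normal(size=(256, 8))
unrelated = rng.normal(size=(256, 8))

loss_paired = barlow_twins_loss(clear, rainy)
loss_unrelated = barlow_twins_loss(clear, unrelated)
print(f"paired: {loss_paired:.4f}, unrelated: {loss_unrelated:.4f}")
```

In this sketch, embeddings that correspond across conditions yield a lower loss than embeddings without correspondence, which is the property that lets image-level correspondences act as a weak supervision signal.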