Panoptic segmentation is an important computer vision task that combines semantic and instance segmentation. It plays a crucial role in domains such as medical image analysis, self-driving vehicles, and robotics by providing a comprehensive understanding of visual environments. Traditionally, deep learning panoptic segmentation models have relied on dense, accurately annotated training data, which is expensive and time-consuming to obtain. Recent advances in self-supervised learning have shown great potential for leveraging synthetic and unlabelled data: pseudo-labels generated through self-training can improve the performance of instance and semantic segmentation models. The three existing methods for self-supervised panoptic segmentation use proposal-based transformer architectures, which are computationally expensive, complicated, and engineered for specific tasks. The aim of this work is to develop a framework for embedding-based self-supervised panoptic segmentation using self-training in a synthetic-to-real domain adaptation setting.