Simulation data can be labeled accurately and are expected to improve the performance of data-driven algorithms, including object detection. However, because of the many domain inconsistencies between simulation and reality (sim-to-real), cross-domain object detection algorithms usually suffer dramatic performance drops. While numerous unsupervised domain adaptation (UDA) methods have been developed for cross-domain tasks between real-world datasets, progress on sim-to-real remains limited. This paper presents a novel Complex-to-Simple (CTS) framework for transferring models from a labeled simulation (source) domain to an unlabeled reality (target) domain. Building on a two-stage detector, the work makes three contributions: 1) fixed-size anchor heads and RoI augmentation that address the size bias and feature diversity between the two domains, thereby improving pseudo-label quality; 2) a novel corner-format representation of aleatoric uncertainty (AU) for bounding boxes that quantifies pseudo-label quality uniformly; 3) a noise-aware mean teacher domain adaptation method based on AU, together with object-level and frame-level sampling strategies, to mitigate the impact of noisy labels. Experimental results demonstrate that the proposed approach significantly enhances the sim-to-real domain adaptation capability of 3D object detection models and outperforms state-of-the-art cross-domain algorithms, which are usually developed for real-to-real UDA tasks.
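As a rough illustration of the third contribution, the sketch below shows the general shape of a mean teacher update combined with uncertainty-based down-weighting of pseudo-labels. It is not the paper's implementation: the abstract does not give the actual loss, thresholds, or sampling rules, so the function names, the `max_sigma` cutoff, the `exp(-sigma)` weighting, and the plain L1 corner loss are all assumptions chosen only to make the idea concrete.

```python
import math


def ema_update(teacher, student, momentum=0.999):
    """Standard mean-teacher step: teacher <- m * teacher + (1 - m) * student.

    Parameters are modeled as a dict of floats for simplicity (hypothetical).
    """
    return {k: momentum * teacher[k] + (1.0 - momentum) * student[k]
            for k in teacher}


def pseudo_label_weight(corner_log_vars, max_sigma=0.5):
    """Turn per-corner predicted log-variances into a single box weight.

    Assumption: a box whose mean corner standard deviation exceeds
    max_sigma is discarded (weight 0); otherwise the weight decays as
    exp(-mean_sigma), so noisier pseudo-labels contribute less.
    """
    sigmas = [math.exp(0.5 * lv) for lv in corner_log_vars]
    mean_sigma = sum(sigmas) / len(sigmas)
    if mean_sigma > max_sigma:
        return 0.0
    return math.exp(-mean_sigma)


def weighted_corner_loss(student_boxes, pseudo_boxes, weights):
    """Weighted mean L1 distance between student and pseudo-label corners.

    Each box is a flat list of corner coordinates (8 corners x 3 = 24 values
    for a 3D box). Returns 0.0 when every pseudo-label was filtered out.
    """
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    loss = 0.0
    for s_box, p_box, w in zip(student_boxes, pseudo_boxes, weights):
        l1 = sum(abs(a - b) for a, b in zip(s_box, p_box)) / len(s_box)
        loss += w * l1
    return loss / total_weight
```

In a training loop of this kind, the teacher would produce pseudo-label boxes and corner uncertainties on unlabeled real frames, the student would be trained with the weighted loss, and `ema_update` would be called after each student step.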