For a self-driving car to operate reliably, its perception system must generalize to the end user's environment, ideally without additional annotation effort. One potential solution is to leverage unlabeled data (e.g., unlabeled LiDAR point clouds) collected from the end users' environments (i.e., the target domain) to adapt the system to the differences between the training and testing environments. While such unsupervised domain adaptation has been studied extensively, one fundamental problem lingers: there is no reliable signal in the target domain to supervise the adaptation process. To overcome this issue, we observe that it is easy to collect unlabeled data from multiple traversals of repeated routes. Although this departs from the conventional unsupervised domain adaptation setting, the assumption is highly realistic, since many drivers share the same roads. We show that this simple additional assumption is sufficient to obtain a potent signal that allows us to perform iterative self-training of 3D object detectors on the target domain. Concretely, we generate pseudo-labels with the out-of-domain detector but reduce false positives by removing detections of supposedly mobile objects that are persistent across traversals. Further, we reduce false negatives by encouraging predictions in regions that are not persistent. We evaluate our approach on two large-scale driving datasets and show remarkable improvement in 3D object detection of cars, pedestrians, and cyclists, bringing us a step closer to generalizable autonomous driving.
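The false-positive reduction step described in the abstract (dropping detections of supposedly mobile objects that persist across traversals) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Detection` record, the `persistence` field (assumed here to be a precomputed value in [0, 1] indicating how consistently a region is occupied across traversals), the function name `filter_pseudo_labels`, and both threshold values are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    label: str          # object class, e.g. "car", "pedestrian", "cyclist"
    score: float        # detector confidence in [0, 1]
    persistence: float  # ASSUMPTION: precomputed value in [0, 1]; high means the
                        # region looks the same across traversals (likely static)


def filter_pseudo_labels(
    detections: List[Detection],
    score_threshold: float = 0.5,        # hypothetical value
    persistence_threshold: float = 0.7,  # hypothetical value
) -> List[Detection]:
    """Keep confident detections whose region is not persistent across traversals."""
    kept = []
    for det in detections:
        if det.score < score_threshold:
            continue  # too uncertain to serve as a pseudo-label
        if det.persistence >= persistence_threshold:
            continue  # a "mobile" object that never moves is a likely false positive
        kept.append(det)
    return kept


detections = [
    Detection("car", 0.9, 0.1),         # confident, non-persistent: kept
    Detection("car", 0.8, 0.95),        # confident but persistent: removed
    Detection("pedestrian", 0.3, 0.0),  # low confidence: removed
]
pseudo_labels = filter_pseudo_labels(detections)
print([d.label for d in pseudo_labels])
```

In an iterative self-training loop, the surviving pseudo-labels would be used to fine-tune the detector on the target domain before the next round of label generation. How persistence is actually computed from the point clouds, and how the false-negative term is implemented, are not specified in the abstract and are left out of this sketch.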