Object detectors often suffer a drop in performance when new environmental conditions are insufficiently represented in the training data. This paper studies how to automatically fine-tune a pre-existing object detector while exploring and acquiring images in a new environment without relying on human intervention, i.e., in a fully self-supervised fashion. In our setting, an agent initially learns to explore the environment using a pre-trained off-the-shelf detector to locate objects and assign pseudo-labels to them. By assuming that pseudo-labels for the same object must be consistent across different views, we learn an exploration policy that mines hard samples, and we devise a novel mechanism for producing refined predictions from the consensus among observations. Our approach outperforms the current state of the art, and it closes the performance gap with a fully supervised setting without relying on ground-truth annotations. We also compare various exploration policies for the agent to gather more informative observations. Code and dataset will be made available upon paper acceptance.
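The abstract does not specify how the consensus among observations is computed, so the following is only a minimal illustrative sketch of the general idea, not the paper's mechanism. It assumes that detections have already been associated with physical objects across views (the hypothetical `object_id` field) and uses a confidence-weighted vote as a stand-in for the refinement step. Detections that disagree with the consensus are treated as candidate hard samples. All names (`Detection`, `consensus_labels`, `hard_samples`) are assumptions introduced for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    object_id: int  # hypothetical: identity of the physical object across views
    view_id: int    # which observation produced this detection
    label: str      # class predicted by the off-the-shelf detector
    score: float    # detector confidence in [0, 1]


def consensus_labels(detections):
    """Return {object_id: label} from a confidence-weighted vote across views.

    Assumption: a weighted vote stands in for the consensus mechanism,
    which the abstract does not describe in detail.
    """
    votes = defaultdict(lambda: defaultdict(float))
    for det in detections:
        votes[det.object_id][det.label] += det.score
    # Sorting the candidate labels makes tie-breaking deterministic.
    return {
        obj: max(sorted(by_label), key=by_label.get)
        for obj, by_label in votes.items()
    }


def hard_samples(detections, consensus):
    """Return detections whose per-view label disagrees with the consensus."""
    return [d for d in detections if d.label != consensus[d.object_id]]


detections = [
    Detection(0, 0, "chair", 0.9),
    Detection(0, 1, "chair", 0.8),
    Detection(0, 2, "couch", 0.6),  # inconsistent view of object 0
    Detection(1, 0, "tv", 0.4),
    Detection(1, 1, "plant", 0.7),  # inconsistent view of object 1
    Detection(1, 2, "tv", 0.5),
]

consensus = consensus_labels(detections)
hard = hard_samples(detections, consensus)
print(consensus)
print([(d.object_id, d.view_id) for d in hard])
```

In this toy setup, the inconsistent views would be relabeled with the consensus class and could serve as fine-tuning samples, since they are exactly the views on which the original detector disagrees with the multi-view evidence.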