Object detectors often lose accuracy when the environmental conditions they encounter are insufficiently represented in the training data. This paper studies how to automatically fine-tune a pre-existing object detector while exploring and acquiring images in a new environment without relying on human intervention, i.e., in a fully self-supervised fashion. In our setting, an agent initially learns to explore the environment using a pre-trained off-the-shelf detector to locate objects and assign pseudo-labels. By assuming that pseudo-labels for the same object must be consistent across different views, we learn an exploration policy that mines hard samples, and we devise a novel mechanism for producing refined predictions from the consensus among observations. Our approach outperforms the current state of the art and closes the performance gap against a fully supervised setting without relying on ground-truth annotations. We also compare various exploration policies for the agent to gather more informative observations. Code and dataset will be made available upon paper acceptance.
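The multi-view consistency assumption can be illustrated with a minimal sketch. It is not the paper's mechanism: it assumes that detections have already been associated with a physical object across views, and it swaps in a plain confidence-weighted vote as the consensus rule. The names `consensus_label` and `refine_pseudo_labels` and the `(view_id, label, score)` tuple layout are hypothetical.

```python
from collections import defaultdict


def consensus_label(observations):
    """Confidence-weighted vote over (label, score) pairs for ONE object.

    Assumption: a simple weighted vote stands in for the consensus step;
    the paper's actual refinement mechanism is not specified here.
    Returns (winning_label, fraction_of_total_score_supporting_it).
    """
    if not observations:
        raise ValueError("need at least one observation")
    votes = defaultdict(float)
    for label, score in observations:
        votes[label] += score
    total = sum(votes.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    best = max(votes, key=votes.get)
    return best, votes[best] / total


def refine_pseudo_labels(tracks):
    """Relabel every view of each object with the consensus label.

    tracks: dict mapping object_id -> list of (view_id, label, score).
    Returns: dict mapping object_id -> list of
             (view_id, consensus_label, disagreed), where disagreed is True
             when the detector's original label differed from the consensus.
    Treating such views as "hard samples" is an illustrative assumption.
    """
    refined = {}
    for obj_id, detections in tracks.items():
        label, _ = consensus_label([(l, s) for _, l, s in detections])
        refined[obj_id] = [
            (view_id, label, original != label)
            for view_id, original, _ in detections
        ]
    return refined


if __name__ == "__main__":
    # Hypothetical detections of one object seen from three viewpoints.
    tracks = {"obj0": [(0, "chair", 0.9), (1, "chair", 0.8), (2, "couch", 0.6)]}
    print(refine_pseudo_labels(tracks))
```

In this toy run the third view disagrees with the consensus, so it is relabeled and flagged. Views like this are the ones a fine-tuning loop would plausibly benefit from most.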