Active learning promises high-performance models from minimal labeled data by judiciously selecting the most informative instances to label and incorporating them into the task learner. Despite notable advances in active learning for image recognition, the metrics devised or learned to gauge the information gain of data, which are crucial to query strategy design, do not consistently align with task model performance metrics such as Mean Average Precision (MeanAP) in object detection. This paper introduces MeanAP-Guided Reinforced Active Learning for Object Detection (MAGRAL), a novel approach that uses the task model's MeanAP directly to drive a sampling strategy carried out by a reinforcement learning-based sampling agent. Built on an LSTM architecture, the agent efficiently explores and selects subsequent training instances and is optimized by policy gradient with MeanAP as the reward. Because computing MeanAP at every step is time-intensive, we propose fast look-up tables to speed up agent training. We evaluate MAGRAL on two popular benchmarks, PASCAL VOC and MS COCO, using different backbone architectures. Empirical results show that MAGRAL outperforms recent state-of-the-art methods by a substantial margin and establishes a robust baseline for reinforced active object detection.
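The interplay between the policy-gradient agent and the look-up table can be illustrated with a toy sketch. This is not the authors' implementation: a plain softmax policy over per-sample logits stands in for the LSTM agent, and a synthetic score stands in for MeanAP. All names here (`SAMPLE_VALUE`, `expensive_map`, `reward`, `run_episode`, `train`) are hypothetical and chosen only for illustration.

```python
import math
import random

# Made-up per-sample utilities; a real system would have no such oracle.
SAMPLE_VALUE = [0.10, 0.20, 0.90, 0.30, 0.80, 0.15]
eval_calls = 0   # counts how often the "expensive" evaluation actually runs
lookup = {}      # look-up table: selected subset -> cached reward


def expensive_map(subset):
    """Stand-in for retraining the detector and measuring MeanAP (assumption)."""
    global eval_calls
    eval_calls += 1
    return sum(SAMPLE_VALUE[i] for i in subset) / len(subset)


def reward(subset):
    """Return the cached reward, evaluating each distinct subset only once."""
    key = frozenset(subset)
    if key not in lookup:
        lookup[key] = expensive_map(key)
    return lookup[key]


def softmax(logits, candidates):
    """Softmax restricted to the still-unselected candidates."""
    m = max(logits[i] for i in candidates)
    exps = {i: math.exp(logits[i] - m) for i in candidates}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}


def run_episode(logits, budget, rng):
    """Pick `budget` samples without replacement; return picks and grad of log-prob."""
    remaining = list(range(len(logits)))
    chosen = []
    grad = [0.0] * len(logits)
    for _ in range(budget):
        probs = softmax(logits, remaining)
        pick = rng.choices(remaining, weights=[probs[i] for i in remaining])[0]
        # d/d logit_i of log p(pick) = 1[i == pick] - p_i over remaining items
        for i in remaining:
            grad[i] += (1.0 if i == pick else 0.0) - probs[i]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen, grad


def train(episodes=500, budget=2, lr=0.5, seed=0):
    """REINFORCE with a moving-average baseline; the reward comes from the table."""
    rng = random.Random(seed)
    logits = [0.0] * len(SAMPLE_VALUE)
    baseline = 0.0
    for _ in range(episodes):
        chosen, grad = run_episode(logits, budget, rng)
        r = reward(chosen)
        advantage = r - baseline
        baseline += 0.1 * (r - baseline)
        for i in range(len(logits)):
            logits[i] += lr * advantage * grad[i]
    return logits


logits = train()
```

In this toy setting the table caps the number of expensive evaluations at the number of distinct subsets (15 for 2 of 6 samples), however many episodes the agent runs. This is the kind of saving the look-up tables are meant to provide.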