Training object detection models usually requires instance-level annotations, such as the positions and labels of all objects present in each image. Unfortunately, such supervision is not always available; more often, only image-level information is provided, a setting known as weak supervision. Recent work has addressed this limitation by leveraging knowledge from a richly annotated domain. However, the scope of weak supervision supported by these approaches has been very restrictive, preventing them from using all available information. In this work, we propose ProbKT, a framework based on probabilistic logical reasoning that allows object detection models to be trained with arbitrary types of weak supervision. We show empirically on different datasets that using all available information is beneficial: ProbKT leads to significant improvement on the target domain and better generalization compared to existing baselines. We also showcase the ability of our approach to handle complex logic statements as a supervision signal.
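To make the idea of training from image-level information concrete, the sketch below shows one way a weak label such as "this image contains exactly k objects of class c" could be turned into a loss on per-box predictions. It is an illustration only and not ProbKT's implementation: the function names `count_distribution` and `weak_count_nll` are hypothetical, and the code assumes that each detected box belongs to class c independently with a known probability.

```python
import math


def count_distribution(box_probs):
    """Distribution over how many boxes belong to the class.

    Assumes (for illustration) that boxes are independent, so the count
    follows a Poisson binomial distribution, computed here by dynamic
    programming. Returns a list where entry k is P(count == k).
    """
    dist = [1.0]  # with zero boxes, the count is 0 with probability 1
    for p in box_probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1.0 - p)  # this box is not of the class
            new[k + 1] += q * p      # this box is of the class
        dist = new
    return dist


def weak_count_nll(box_probs, observed_count, eps=1e-12):
    """Negative log-likelihood of an image-level count label."""
    dist = count_distribution(box_probs)
    if observed_count < 0 or observed_count >= len(dist):
        likelihood = 0.0  # more objects claimed than boxes available
    else:
        likelihood = dist[observed_count]
    return -math.log(max(likelihood, eps))


if __name__ == "__main__":
    probs = [0.5, 0.5]  # hypothetical per-box probabilities for one class
    print(count_distribution(probs))
    print(weak_count_nll(probs, observed_count=1))
```

A weaker label such as "class c is present" corresponds to the event that the count is at least one, whose probability is one minus the first entry of the distribution. In a real training setup the same computation would be written in a differentiable framework so that gradients flow back to the detector.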