There is a longstanding interest in capturing the error behaviour of object detectors by finding images where their performance is likely to be unsatisfactory. In real-world applications such as autonomous driving, it is also crucial to characterise potential failures beyond simple requirements of detection performance. For example, a missed detection of a pedestrian close to an ego vehicle will generally require closer inspection than a missed detection of a car in the distance. The problem of predicting such potential failures at test time has largely been overlooked in the literature and conventional approaches based on detection uncertainty fall short in that they are agnostic to such fine-grained characterisation of errors. In this work, we propose to reformulate the problem of finding "hard" images as a query-based hard image retrieval task, where queries are specific definitions of "hardness", and offer a simple and intuitive method that can solve this task for a large family of queries. Our method is entirely post-hoc, does not require ground-truth annotations, is independent of the choice of a detector, and relies on an efficient Monte Carlo estimation that uses a simple stochastic model in place of the ground-truth. We show experimentally that it can be applied successfully to a wide variety of queries for which it can reliably identify hard images for a given detector without any labelled data. We provide results on ranking and classification tasks using the widely used RetinaNet, Faster-RCNN, Mask-RCNN, and Cascade Mask-RCNN object detectors. The code for this project is available at https://github.com/fiveai/hardest.
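To make the query-based formulation concrete, the following is a minimal sketch of how a Monte Carlo hardness estimate could be computed from detector outputs alone. It is not the authors' implementation (see the linked repository for that). Everything in it is an assumption made for illustration: the `Detection` fields, the 0.5 score threshold, the 15 m distance cut-off, and in particular the stochastic model, which here treats each raw detection as a real object with probability equal to its confidence score. The example query counts pedestrians near the ego vehicle that the thresholded detector would miss.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Detection:
    """One raw detector output (all fields are hypothetical, for illustration)."""
    label: str
    score: float       # detector confidence in [0, 1]
    distance_m: float  # assumed distance from the ego vehicle, in metres


# A query maps (detections, sampled existence flags) to a hardness value.
Query = Callable[[List[Detection], List[bool]], float]


def near_pedestrian_misses(dets: List[Detection], is_real: List[bool],
                           score_thr: float = 0.5,
                           max_dist_m: float = 15.0) -> float:
    """Example query: number of nearby pedestrians that exist in the sampled
    pseudo ground truth but fall below the detector's score threshold."""
    return float(sum(
        1 for d, real in zip(dets, is_real)
        if real
        and d.score < score_thr
        and d.label == "pedestrian"
        and d.distance_m <= max_dist_m
    ))


def estimate_hardness(dets: List[Detection], query: Query,
                      n_samples: int = 1000, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected query value for one image.

    Assumed stochastic model: each detection is a real object independently
    with probability equal to its score. No annotations are used.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        is_real = [rng.random() < d.score for d in dets]
        total += query(dets, is_real)
    return total / n_samples


def rank_images(images: Dict[str, List[Detection]], query: Query) -> List[str]:
    """Return image names sorted from hardest to easiest under the query."""
    scores = {name: estimate_hardness(dets, query) for name, dets in images.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy inputs: img_a contains a low-confidence pedestrian close to the vehicle.
images = {
    "img_a": [Detection("pedestrian", 0.40, 8.0), Detection("car", 0.95, 60.0)],
    "img_b": [Detection("car", 0.30, 80.0), Detection("pedestrian", 0.90, 5.0)],
}
ranking = rank_images(images, near_pedestrian_misses)
```

Under these assumptions, a different definition of hardness only requires a different query function; the sampling loop and the ranking step stay unchanged.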