Deep neural network-based object detectors are continuously evolving and are used in a multitude of applications, each with its own set of requirements. While safety-critical applications need high accuracy and reliability, low-latency tasks need resource- and energy-efficient networks. Real-time detectors, which are a necessity in high-impact real-world applications, are continuously proposed, but they overemphasize improvements in accuracy and speed while neglecting other capabilities such as versatility, robustness, and resource and energy efficiency. Neither a reference benchmark for existing networks nor a standard evaluation guideline for designing new networks exists, which results in ambiguous and inconsistent comparisons. We therefore conduct a comprehensive study of multiple real-time detectors (anchor-, keypoint-, and transformer-based) on a wide range of datasets and report results on an extensive set of metrics. We also study the impact of variables such as image size, anchor dimensions, confidence thresholds, and architecture layers on overall performance. We analyze the robustness of detection networks against distribution shifts, natural corruptions, and adversarial attacks, and we provide a calibration analysis to gauge the reliability of the predictions. To highlight the real-world impact, we conduct two case studies, on autonomous driving and healthcare applications. To further gauge the capability of networks in critical real-time applications, we report their performance after deployment on edge devices. Our extensive empirical study can act as a guideline for the industrial community to make an informed choice among existing networks. We also hope to inspire the research community towards a new direction in the design and evaluation of networks, one that takes a broader, more holistic view for far-reaching impact.