Considerable research effort has been devoted to LiDAR-based 3D object detection, and empirical performance has improved significantly. While this progress is encouraging, we observe an overlooked issue: it is not yet common practice to compare different 3D detectors under the same cost, e.g., inference latency. This makes it difficult to quantify the true performance gain brought by recently proposed architecture designs. The goal of this work is to conduct a cost-aware evaluation of LiDAR-based 3D object detectors. Specifically, we focus on SECOND, a simple grid-based one-stage detector, and analyze its performance under different costs by scaling its original architecture. We then compare the family of scaled SECOND models with recent 3D detection methods, such as Voxel R-CNN and PV-RCNN++. The results are surprising. We find that, given the same latency, SECOND can match the performance of PV-RCNN++, the current state-of-the-art method on the Waymo Open Dataset. Scaled SECOND also easily outperforms many recent 3D detection methods published during the past year. We recommend that future research control for inference cost in empirical comparisons and include the family of scaled SECOND models as a strong baseline when presenting novel 3D detection methods.