3D object detection with LiDAR-camera fusion suffers from overfitting during algorithm development, which stems from the violation of some fundamental rules. We draw on the data annotation process used in dataset construction to complement the theory, and argue that the regression task should not involve features from the camera branch. Following the perspective of 'Detecting As Labeling', we propose a novel paradigm dubbed DAL. Using the most classical elementary algorithms, we construct a simple prediction pipeline that imitates the data annotation process. We then train it in the simplest way to minimize its dependencies and strengthen its portability. Though simple in construction and training, the proposed DAL paradigm not only substantially pushes the performance boundary but also provides a superior trade-off between speed and accuracy among all existing methods. With this comprehensive superiority, DAL is an ideal baseline for both future work and practical deployment. The code has been released at https://github.com/HuangJunJie2017/BEVDet to facilitate future work.
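The central design rule stated above, that box regression should not see camera features, can be illustrated with a minimal NumPy sketch. This is not the released BEVDet/DAL code: the function name `dal_style_head`, the feature shapes, and the linear heads are hypothetical, and the sketch assumes that classification still uses the fused LiDAR-camera features, which the abstract implies but does not state.

```python
import numpy as np

def dal_style_head(lidar_feat, cam_feat, w_cls, w_reg):
    """Hypothetical prediction head (illustration only).

    lidar_feat: (N, C_l) per-candidate LiDAR features
    cam_feat:   (N, C_c) per-candidate camera features
    w_cls:      (C_l + C_c, K) classification weights over fused features
    w_reg:      (C_l, D) regression weights over LiDAR features only
    """
    # Classification (assumed): uses both modalities.
    fused = np.concatenate([lidar_feat, cam_feat], axis=1)
    cls_logits = fused @ w_cls
    # Regression: deliberately ignores the camera branch.
    box_reg = lidar_feat @ w_reg
    return cls_logits, box_reg

rng = np.random.default_rng(0)
lidar = rng.normal(size=(4, 8))   # 4 candidates, 8 LiDAR channels
cam = rng.normal(size=(4, 6))     # 4 candidates, 6 camera channels
w_cls = rng.normal(size=(14, 3))  # 3 classes
w_reg = rng.normal(size=(8, 7))   # 7 box parameters

cls_a, reg_a = dal_style_head(lidar, cam, w_cls, w_reg)
# Swap in different camera features: only the class scores should change.
cls_b, reg_b = dal_style_head(lidar, rng.normal(size=(4, 6)), w_cls, w_reg)
```

In this sketch, replacing the camera features changes the classification logits but leaves the regressed box parameters untouched, which is the property the abstract argues for.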