Comprehending the environment and accurately detecting objects in 3D space are essential for advancing autonomous vehicle technologies. Fusing camera and LiDAR data has emerged as an effective approach for achieving high accuracy in 3D object detection. However, existing methods often rely on heavy, traditional backbones that are computationally demanding. This paper introduces a novel approach that incorporates cutting-edge deep learning techniques into the feature extraction process, aiming to create more efficient models without compromising performance. Our model, NextBEV, surpasses established feature extractors such as ResNet50 and MobileNetV2. On the KITTI 3D monocular detection benchmark, NextBEV achieves a 2.39% accuracy improvement while using fewer than 10% of MobileNetV3's parameters. Moreover, we propose modifications to LiDAR backbones that reduce the original inference time to 10 ms. Additionally, by fusing these lightweight proposals, we improve the accuracy of the VoxelNet-based model by 2.93% and the F1-score of the PointPillar-based model by approximately 20%. This work therefore contributes lightweight yet powerful models for standalone or fusion pipelines, making them better suited to onboard deployment.