LiDAR and Radar are complementary sensing modalities: LiDAR specializes in capturing an object's 3D shape, while Radar provides longer detection ranges and velocity hints. Although combining them seems natural, how to do so efficiently for improved feature representation remains unclear. The main challenge is that Radar data are extremely sparse and lack height information, so directly integrating Radar features into LiDAR-centric detection networks is suboptimal. In this work, we introduce a bi-directional LiDAR-Radar fusion framework, termed Bi-LRFusion, to tackle these challenges and improve 3D detection of dynamic objects. Technically, Bi-LRFusion involves two steps: first, it enriches Radar's local features by learning important details from the LiDAR branch, alleviating the problems caused by the absence of height information and extreme sparsity; second, it combines LiDAR features with the enhanced Radar features in a unified bird's-eye-view representation. We conduct extensive experiments on the nuScenes and ORR datasets and show that Bi-LRFusion achieves state-of-the-art performance for detecting dynamic objects. Notably, Radar data in these two datasets have different formats, which demonstrates the generalizability of our method. Code is available at https://github.com/JessieW0806/BiLRFusion.
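The two steps can be illustrated with a toy sketch. This is not the Bi-LRFusion implementation: the actual method learns both steps with network modules, whereas the code below substitutes hand-written pooling so that the data flow is visible. The function names, the 1 m cell size, the neighbourhood radius, and the per-cell feature layout are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

CELL = 1.0  # assumed bird's-eye-view (BEV) cell size in metres


def to_cell(x, y, cell=CELL):
    """Map metric (x, y) coordinates to integer BEV grid indices."""
    return (int(x // cell), int(y // cell))


def lidar_bev(points, cell=CELL):
    """Pool LiDAR points (x, y, z) into per-cell [mean_z, max_z, count]."""
    buckets = defaultdict(list)
    for x, y, z in points:
        buckets[to_cell(x, y, cell)].append(z)
    return {c: [mean(zs), max(zs), float(len(zs))] for c, zs in buckets.items()}


def enrich_radar(radar_points, lidar_cells, cell=CELL, radius=1):
    """Step 1 (toy stand-in): give each Radar point a height hint taken from
    nearby LiDAR cells, since Radar points (x, y, vx, vy) carry no height.
    Returns per-cell [vx, vy, height_hint]; a later point in the same cell
    overwrites an earlier one, which keeps the sketch short."""
    enriched = {}
    for x, y, vx, vy in radar_points:
        cx, cy = to_cell(x, y, cell)
        nearby = [
            lidar_cells[(cx + dx, cy + dy)][0]  # mean_z of a neighbouring cell
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if (cx + dx, cy + dy) in lidar_cells
        ]
        height_hint = mean(nearby) if nearby else 0.0
        enriched[(cx, cy)] = [vx, vy, height_hint]
    return enriched


def fuse(lidar_cells, radar_cells):
    """Step 2 (toy stand-in): concatenate LiDAR and enriched Radar features
    per cell on the shared BEV grid, zero-filling whichever side is missing."""
    fused = {}
    for c in set(lidar_cells) | set(radar_cells):
        fused[c] = lidar_cells.get(c, [0.0] * 3) + radar_cells.get(c, [0.0] * 3)
    return fused
```

Calling `fuse(lidar_bev(lidar_points), enrich_radar(radar_points, lidar_bev(lidar_points)))` yields a sparse BEV map with six values per occupied cell. In a learned system, a detection head would consume such a fused map; here it only shows where each modality contributes.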