Localization is a fundamental task in robotics for autonomous navigation. Existing localization methods either rely on a single input data modality or train several computational models to process different modalities. This leads to stringent computational requirements and sub-optimal results that fail to exploit the complementary information in other data streams. This paper proposes UnLoc, a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions. Our multi-stream network handles LiDAR, camera, and radar inputs on demand, i.e., it can work with any one or more of these sensors, which makes it robust to sensor failure. UnLoc uses 3D sparse convolutions and cylindrical partitioning of space to process LiDAR frames, and ResNet blocks with a slot attention-based feature filtering module for the radar and image modalities. We introduce a learnable modality encoding scheme to distinguish between the input sensor data. We evaluate our method extensively on the Oxford Radar RobotCar, ApolloSouthBay, and Perth-WA datasets. The results demonstrate the efficacy of our technique.
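To make the cylindrical partitioning step concrete, the following is a minimal sketch of how LiDAR points might be binned into cylindrical cells before sparse convolution. It is not UnLoc's implementation: the function names `cylindrical_voxel_index` and `partition`, the grid resolution, and the range limits are all hypothetical, illustrative choices that do not come from the paper.

```python
import math
from collections import defaultdict


def cylindrical_voxel_index(x, y, z, grid=(480, 360, 32),
                            rho_max=50.0, z_min=-4.0, z_max=2.0):
    """Map one Cartesian point to a (radius, angle, height) cell index.

    Returns None if the point lies outside the partitioned volume.
    All default values are illustrative assumptions, not the paper's settings.
    """
    n_rho, n_phi, n_z = grid
    rho = math.hypot(x, y)   # radial distance from the sensor axis
    phi = math.atan2(y, x)   # azimuth in [-pi, pi]
    if rho >= rho_max or not (z_min <= z < z_max):
        return None
    # Scale each coordinate to its bin count; clamp to guard against rounding.
    i_rho = min(int(rho / rho_max * n_rho), n_rho - 1)
    i_phi = int((phi + math.pi) / (2 * math.pi) * n_phi) % n_phi
    i_z = min(int((z - z_min) / (z_max - z_min) * n_z), n_z - 1)
    return i_rho, i_phi, i_z


def partition(points, **kwargs):
    """Group (x, y, z) points by cylindrical cell, dropping out-of-range points."""
    cells = defaultdict(list)
    for point in points:
        idx = cylindrical_voxel_index(*point, **kwargs)
        if idx is not None:
            cells[idx].append(point)
    return dict(cells)
```

In this sketch only the occupied cells are kept, which is the kind of sparse structure that 3D sparse convolutions operate on. Because the cells are wedge-shaped, those far from the sensor cover more volume than those close to it, which partly compensates for the lower point density of LiDAR returns at range.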