In autonomous driving, addressing occlusion scenarios is crucial yet challenging. Robust surrounding perception is essential for handling occlusions and aiding motion planning. State-of-the-art models fuse LiDAR and camera data to produce impressive perception results, but detecting occluded objects remains challenging. In this paper, we emphasize the crucial role of temporal cues by integrating them alongside these modalities to address this challenge. We propose a novel approach for bird's-eye-view semantic grid segmentation that leverages sequential sensor data to achieve robustness against occlusions. Our model extracts information from the sensor readings using attention operations and aggregates it into a lower-dimensional latent representation, thus enabling the processing of multi-step inputs at each prediction step. Moreover, we show how the model can be directly applied to forecast the evolution of traffic scenes and be seamlessly integrated into a motion planner for trajectory planning. On semantic segmentation tasks, we evaluate our model on the nuScenes dataset and show that it outperforms other baselines, with particularly large margins on occluded and partially occluded vehicles. Additionally, on the motion planning task, we are among the first teams to train and evaluate on nuPlan, a cutting-edge large-scale dataset for motion planning.
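The attention-based aggregation into a lower-dimensional latent described above can be illustrated with a minimal NumPy sketch of Perceiver-style cross-attention, where a fixed-size set of latent vectors repeatedly attends to per-step sensor tokens. All names, dimensions, and the random projection weights here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(latent, tokens, d):
    # latent: (L, d) fixed-size queries; tokens: (N, d) sensor features.
    # Random projections stand in for learned weights (illustrative only).
    rng = np.random.default_rng(0)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = latent @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(d))   # (L, N) attention weights
    return latent + attn @ v               # residual update of the latent

# Aggregate a 3-step sensor sequence into one fixed-size latent:
# each time step may contribute a different number of tokens, but the
# latent stays (L, d), so later modules see a constant-size input.
d, L = 32, 16
latent = np.zeros((L, d))
for t, n_tokens in enumerate([200, 150, 180]):
    sensor_tokens = np.random.default_rng(t).standard_normal((n_tokens, d))
    latent = cross_attend(latent, sensor_tokens, d)
print(latent.shape)  # (16, 32)
```

The key design point this sketch shows is that the cost of attending to each step's tokens is linear in the token count, while the latent size is constant, which is what makes processing multi-step inputs at every prediction step tractable.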