Predicting the trajectories of surrounding agents remains one of the most challenging tasks in autonomous driving. In this paper, we introduce a multi-modal trajectory prediction framework based on the transformer network. Semantic maps of each agent are fed to convolutional networks, which automatically extract relevant contextual information. We also propose a novel auxiliary loss that penalizes infeasible off-road predictions. Experiments on the Lyft l5kit dataset show that the proposed model achieves state-of-the-art performance and substantially improves both the accuracy and the feasibility of its predictions.
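The abstract does not give the form of the auxiliary off-road loss, so the following is only a rough sketch of one way such a penalty could be computed, not the authors' formulation. It assumes a hypothetical per-pixel cost map `offroad_cost` (zero on the drivable area, positive elsewhere) and predicted waypoints already expressed in raster pixel coordinates. The function name `offroad_penalty` and the array layouts are illustrative assumptions. The cost map is sampled with bilinear interpolation so that the penalty varies smoothly with waypoint position, which is what a gradient-based training loop would need if the same idea were ported to an autodiff framework.

```python
import numpy as np


def offroad_penalty(trajs, offroad_cost):
    """Mean off-road cost over all predicted waypoints (illustrative sketch).

    trajs:        (K, T, 2) array of K candidate trajectories with T waypoints
                  each, in pixel coordinates (x = column, y = row). This layout
                  is an assumption, not taken from the paper.
    offroad_cost: (H, W) array, 0 on drivable area and > 0 elsewhere.
    """
    h, w = offroad_cost.shape

    # Clamp waypoints to the raster so out-of-image points reuse border values.
    x = np.clip(trajs[..., 0], 0.0, w - 1.0)
    y = np.clip(trajs[..., 1], 0.0, h - 1.0)

    # Integer corners of the pixel cell surrounding each waypoint.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional position inside the cell.
    fx = x - x0
    fy = y - y0

    # Bilinear interpolation of the cost map at each waypoint.
    top = (1.0 - fx) * offroad_cost[y0, x0] + fx * offroad_cost[y0, x1]
    bottom = (1.0 - fx) * offroad_cost[y1, x0] + fx * offroad_cost[y1, x1]
    cost = (1.0 - fy) * top + fy * bottom

    return float(np.mean(cost))
```

In training, a term like this would typically be added to the main trajectory regression loss with a small weight. That weight, and whether the penalty is averaged over all modes or only applied to the best one, are design choices the abstract does not specify.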