Autonomous vehicles must forecast the motion of surrounding agents (pedestrians and vehicles) to make optimal navigation decisions. Existing methods focus on exploiting the positions and velocities of these agents and fail to capture semantic information from the scene. Moreover, to limit the computational cost that grows with the number of agents in the scene, some works use Euclidean distance to prune far-away agents. However, a distance-based metric alone is insufficient to select the relevant agents and predict their motion accurately. To resolve these issues, we propose the Semantics-aware Interactive Multiagent Motion Forecasting (SIMMF) method, which captures semantics along with spatial information and optimally selects relevant agents for motion prediction. Specifically, we implement a semantics-aware selection of relevant agents from the scene and pass them through an attention mechanism to extract global encodings. These encodings, along with the agents' local information, are passed through an encoder to obtain time-dependent latent variables for a motion policy that predicts future trajectories. Our results show that the proposed approach outperforms state-of-the-art baselines and provides more accurate and scene-consistent predictions.
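To make the contrast between distance-based pruning and semantics-aware selection concrete, the following is a minimal illustrative sketch, not the SIMMF implementation. The abstract does not specify the selection rule, so the rule here (keep an agent if it is nearby or carries a relevant semantic label) is an assumption, as are all function names, the label strings, the radius, and the feature dimension. The attention step is plain single-query scaled dot-product attention, used only as a stand-in for the attention mechanism that produces the global encoding.

```python
import numpy as np


def select_by_distance(ego_xy, agents_xy, radius):
    """Baseline pruning: indices of agents within `radius` of the ego vehicle."""
    dists = np.linalg.norm(agents_xy - ego_xy, axis=1)
    return np.flatnonzero(dists <= radius)


def select_with_semantics(ego_xy, agents_xy, agent_labels, relevant_labels, radius):
    """Hypothetical rule (an assumption, not from the paper):
    keep agents that are nearby OR carry a relevant semantic label."""
    dists = np.linalg.norm(agents_xy - ego_xy, axis=1)
    is_relevant = np.isin(agent_labels, list(relevant_labels))
    return np.flatnonzero((dists <= radius) | is_relevant)


def attention_pool(query, features):
    """Single-query scaled dot-product attention over agent features.
    Returns the pooled (global) encoding and the attention weights."""
    d = query.shape[-1]
    scores = features @ query / np.sqrt(d)
    scores = scores - scores.max()  # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ features, weights


# Toy scene: three agents at distances 5, 50, and 10 from the ego vehicle.
ego_xy = np.array([0.0, 0.0])
agents_xy = np.array([[3.0, 4.0], [30.0, 40.0], [6.0, 8.0]])
agent_labels = np.array(["sidewalk", "ego_lane", "parked"])  # made-up labels

near = select_by_distance(ego_xy, agents_xy, radius=12.0)
chosen = select_with_semantics(
    ego_xy, agents_xy, agent_labels, {"ego_lane"}, radius=12.0
)

# Random stand-in features; a real model would learn these.
rng = np.random.default_rng(0)
features = rng.normal(size=(len(agents_xy), 4))
query = rng.normal(size=4)
encoding, weights = attention_pool(query, features[chosen])
```

In this toy scene the distance rule drops the far-away agent in the ego lane, while the semantics-aware rule keeps it, which is the kind of case where distance alone would discard a relevant agent.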