Predicting the future motion of dynamic agents is essential to ensure safety or assess risk in motion planning for autonomous robots. In this paper, we propose a two-stage motion prediction method, referred to as R-Pred, that exploits both scene context and interaction context through a cascade of an initial trajectory proposal network and a trajectory refinement network. The initial trajectory proposal network produces M trajectory proposals corresponding to M modes of the future trajectory distribution. The trajectory refinement network enhances each of the M proposals using 1) tube-query scene attention (TQSA) and 2) proposal-level interaction attention (PIA). TQSA uses tube-queries to aggregate local scene context features pooled from the vicinity of each trajectory proposal of interest. PIA further enhances the trajectory proposals by modeling inter-agent interactions using a group of trajectory proposals selected based on their distances from neighboring agents. Experiments on the Argoverse and nuScenes datasets demonstrate that the proposed refinement network provides significant performance improvements over the single-stage baseline and that R-Pred achieves state-of-the-art performance in some categories of the benchmark.
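To make the two refinement ideas concrete, the following is a minimal, illustrative sketch and not the R-Pred implementation. It assumes 2D waypoints, a fixed tube radius, single-head dot-product attention, and a mean waypoint-wise distance for ranking neighbor proposals; the function names (`tube_pool`, `attend`, `select_nearest_proposals`) and all parameters are hypothetical and do not come from the paper.

```python
import math


def tube_pool(proposal, scene_points, radius):
    """Indices of scene points within `radius` of any waypoint of one proposal.

    Assumption: the "tube" is approximated as the union of discs of a fixed
    radius centred on the proposal's waypoints.
    """
    selected = []
    for idx, (px, py) in enumerate(scene_points):
        if any(math.hypot(px - wx, py - wy) <= radius for wx, wy in proposal):
            selected.append(idx)
    return selected


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    if not keys:
        raise ValueError("attention needs at least one key/value pair")
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def select_nearest_proposals(target, neighbor_proposals, k):
    """Indices of the k neighbor proposals closest to the target proposal.

    Assumption: closeness is the mean distance between waypoints at the same
    time step; the selection rule used in the paper may differ.
    """
    def mean_distance(other):
        pairs = list(zip(target, other))
        return sum(math.hypot(ax - bx, ay - by)
                   for (ax, ay), (bx, by) in pairs) / len(pairs)

    order = sorted(range(len(neighbor_proposals)),
                   key=lambda i: mean_distance(neighbor_proposals[i]))
    return order[:k]


# Usage: pool scene features inside the tube, then aggregate them with attention.
proposal = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
scene_points = [(0.0, 0.5), (1.0, 3.0), (2.0, -0.4), (5.0, 5.0)]
scene_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]

in_tube = tube_pool(proposal, scene_points, radius=1.0)
pooled = [scene_features[i] for i in in_tube]
context = attend([1.0, 0.0], pooled, pooled)
```

In this sketch, restricting the keys and values to points inside the tube is what makes the scene context local to a proposal, and the same `attend` helper could be applied to the features of the proposals returned by `select_nearest_proposals` to mimic a proposal-level interaction step.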