Predicting the future motion of dynamic agents is essential to ensuring safety and assessing risk in motion planning for autonomous robots. In this study, we propose a two-stage motion prediction method, called R-Pred, designed to effectively utilize both scene and interaction context through a cascade of an initial trajectory proposal network and a trajectory refinement network. The initial trajectory proposal network produces M trajectory proposals corresponding to the M modes of the future trajectory distribution. The trajectory refinement network enhances each of the M proposals using 1) tube-query scene attention (TQSA) and 2) proposal-level interaction attention (PIA) mechanisms. TQSA uses tube-queries to aggregate local scene context features pooled from the vicinity of each trajectory proposal of interest. PIA further enhances the trajectory proposals by modeling inter-agent interactions using a group of trajectory proposals selected by their distances from neighboring agents. Experiments on the Argoverse and nuScenes datasets demonstrate that the proposed refinement network provides significant performance improvements over the single-stage baseline and that R-Pred achieves state-of-the-art performance in some categories of the benchmarks.
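To make the idea behind tube-query scene attention more concrete, the following is a minimal sketch, not the R-Pred implementation. It assumes that a "tube" can be approximated as the set of scene elements lying within a fixed radius of any waypoint of a proposal, and that aggregation can be illustrated with single-head scaled dot-product attention. The function names `tube_pool` and `attend`, the `radius` parameter, the toy features, and the fallback for an empty tube are all hypothetical choices made for illustration.

```python
import math


def tube_pool(proposal, scene_points, radius):
    """Collect features of scene points within `radius` of any waypoint.

    Assumption: the tube is approximated by a fixed-radius neighborhood
    around the proposal's waypoints.
    """
    pooled = []
    for (px, py), feat in scene_points:
        if any(math.hypot(px - wx, py - wy) <= radius for wx, wy in proposal):
            pooled.append(feat)
    return pooled


def attend(query, features):
    """Single-head scaled dot-product attention of one query over features."""
    if not features:
        # Hypothetical fallback: with no local context, return the query as-is.
        return list(query)
    scale = math.sqrt(len(query))
    scores = [sum(q * f for q, f in zip(query, feat)) / scale for feat in features]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(query)
    return [
        sum(w * feat[i] for w, feat in zip(weights, features)) / total
        for i in range(dim)
    ]


# Toy data: one proposal along the x-axis and three scene elements,
# each given as ((x, y), feature_vector).
proposal = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
scene_points = [
    ((1.0, 0.5), [1.0, 0.0]),
    ((2.0, -0.5), [0.0, 1.0]),
    ((10.0, 10.0), [5.0, 5.0]),  # far from the proposal, outside the tube
]

pooled = tube_pool(proposal, scene_points, radius=1.0)
context = attend([0.0, 0.0], pooled)
```

In this toy setup only the two nearby scene elements fall inside the tube, so the distant element cannot influence the aggregated context for this proposal. A learned model would replace the raw features and the zero query with learned embeddings.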