Predicting the future motion of dynamic agents is essential for ensuring safety and assessing risk in motion planning for autonomous robots. In this study, we propose R-Pred, a two-stage motion prediction method that exploits both scene and interaction context through a cascade of an initial trajectory proposal network and a trajectory refinement network. The initial trajectory proposal network produces M trajectory proposals corresponding to the M modes of the future trajectory distribution. The trajectory refinement network enhances each of the M proposals using two mechanisms: 1) tube-query scene attention (TQSA) and 2) proposal-level interaction attention (PIA). TQSA uses tube-queries to aggregate local scene context features pooled from the vicinity of the trajectory proposal of interest. PIA further enhances the trajectory proposals by modeling inter-agent interactions using a group of trajectory proposals selected by their distances from neighboring agents. Experiments on the Argoverse and nuScenes datasets demonstrate that the proposed refinement network provides significant performance improvements over the single-stage baseline and that R-Pred achieves state-of-the-art performance in some categories of the benchmarks.
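To make the two refinement ideas more concrete, the following is a minimal sketch and not the authors' implementation. It assumes 2D waypoints, replaces the attention in TQSA with plain mean pooling over scene points that fall inside a fixed-radius "tube" around a proposal, and interprets the PIA selection step as keeping the k neighboring proposals with the smallest mean waypoint distance to the target proposal. The function names, the `radius` and `k` parameters, and the toy data are all hypothetical.

```python
import math


def tube_pool(proposal, scene_points, radius):
    """Mean-pool features of scene points within `radius` of any waypoint.

    Assumption: mean pooling stands in for the attention used by TQSA.
    `proposal` is a list of (x, y) waypoints; `scene_points` is a list of
    ((x, y), feature_vector) pairs. Returns None if the tube is empty.
    """
    picked = [
        feat
        for pos, feat in scene_points
        if any(math.dist(pos, wp) <= radius for wp in proposal)
    ]
    if not picked:
        return None
    dim = len(picked[0])
    return [sum(f[i] for f in picked) / len(picked) for i in range(dim)]


def mean_waypoint_distance(a, b):
    """Average Euclidean distance between time-aligned waypoints of a and b."""
    n = min(len(a), len(b))
    return sum(math.dist(p, q) for p, q in zip(a, b)) / n


def select_closest_proposals(target, neighbor_proposals, k):
    """Keep the k neighbor proposals closest to `target`.

    Assumption: "distance" is the mean waypoint distance defined above;
    the abstract does not specify the exact selection rule.
    """
    ranked = sorted(
        neighbor_proposals,
        key=lambda p: mean_waypoint_distance(target, p),
    )
    return ranked[:k]


# Toy data (hypothetical): one proposal heading along the x-axis.
proposal = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
scene_points = [
    ((1.0, 0.5), [1.0, 2.0]),       # inside the tube
    ((2.0, -0.5), [3.0, 4.0]),      # inside the tube
    ((5.0, 5.0), [100.0, 100.0]),   # far away, ignored
]
pooled = tube_pool(proposal, scene_points, radius=1.0)

neighbors = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],     # runs parallel, 1 unit away
    [(0.0, 10.0), (1.0, 10.0), (2.0, 10.0)],  # runs parallel, 10 units away
]
closest = select_closest_proposals(proposal, neighbors, k=1)
```

In this toy setup only the two nearby scene points contribute to the pooled feature, and only the nearer neighbor proposal is kept for interaction modeling. A learned model would replace the mean with attention weights and operate on feature embeddings rather than raw coordinates.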