Multi-agent motion prediction is a crucial concern in autonomous driving, yet it remains challenging because the intentions of dynamic agents are ambiguous and their interactions are intricate. Existing studies capture interactions between road entities using only the observed data from past timesteps, since future information is unavailable and highly uncertain. However, without sufficient guidance for capturing the future states of interacting agents, these methods frequently produce unrealistic trajectory overlaps. In this work, we propose Future Interaction modeling for Motion Prediction (FIMP), which captures potential future interactions in an end-to-end manner. FIMP adopts a future decoder that implicitly extracts potential future information at an intermediate feature level, and identifies interacting entity pairs through future affinity learning and a top-k filtering strategy. Experiments show that our future interaction modeling substantially improves prediction performance, leading to superior results on the Argoverse motion forecasting benchmark.
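To make the affinity-plus-filtering idea concrete, the following is a minimal sketch, not FIMP's actual implementation. It assumes each agent already has a future feature vector (in FIMP these would come from the future decoder), uses a scaled dot product as a stand-in for the learned future affinity, and keeps the k highest-weighted partners per agent. The function names `future_affinity` and `top_k_interactions` are hypothetical and chosen for illustration only.

```python
import math


def future_affinity(feat_a, feat_b):
    """Scaled dot product between two future feature vectors.

    Assumption: a stand-in for the affinity that FIMP learns end-to-end.
    """
    dot = sum(a * b for a, b in zip(feat_a, feat_b))
    return dot / math.sqrt(len(feat_a))


def top_k_interactions(future_feats, k):
    """For each agent, keep the k other agents with the highest affinity.

    Returns a dict mapping agent index -> list of (partner index, weight),
    sorted by descending weight. Weights are a softmax over all other agents.
    """
    neighbours = {}
    for i, feat_i in enumerate(future_feats):
        scores = [
            (j, future_affinity(feat_i, feat_j))
            for j, feat_j in enumerate(future_feats)
            if j != i
        ]
        if not scores:  # a single agent has no interaction partners
            neighbours[i] = []
            continue
        # Numerically stable softmax over the affinity scores.
        peak = max(score for _, score in scores)
        exps = [(j, math.exp(score - peak)) for j, score in scores]
        total = sum(e for _, e in exps)
        weights = [(j, e / total) for j, e in exps]
        # Top-k filtering: discard all but the k strongest partners.
        weights.sort(key=lambda pair: pair[1], reverse=True)
        neighbours[i] = weights[:k]
    return neighbours


# Toy example: agents 0 and 1 have similar future features, agent 2 differs.
toy_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
toy_pairs = top_k_interactions(toy_feats, k=1)
```

In this toy setting, filtering restricts each agent to its single most relevant partner, which illustrates how top-k selection could prevent weakly related agents from influencing the prediction.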