Motion forecasting is a challenging problem in autonomous driving because interactions in dynamic scenarios are complex and change over time. Most existing works characterize scenarios with static road graphs, which limits their ability to model evolving spatio-temporal dependencies. In this paper, we model the scenario with dynamic heterogeneous graphs that jointly encode its components, including vehicles (agents) and lanes, the multi-type interactions among them, and how these change over time. Furthermore, we design a novel heterogeneous graph convolutional recurrent network that aggregates diverse interaction information and captures its evolution, so as to exploit the intrinsic spatio-temporal dependencies in dynamic graphs and obtain effective representations of dynamic scenarios. Finally, with a motion forecasting decoder, our model predicts realistic, multi-modal future trajectories of agents and outperforms state-of-the-art published works on several motion forecasting benchmarks.
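As a rough illustration of the idea summarized in the abstract, the sketch below encodes a tiny scene as a sequence of typed edge lists (a dynamic heterogeneous graph), aggregates neighbor features separately per relation type, and carries a per-node state across time steps. It is a toy under stated assumptions, not the network proposed in the paper: the node names, relation names, scalar relation weights, and the fixed blending factor `keep` are all hypothetical, and a learned model would use weight matrices and a gated recurrent cell instead.

```python
import math
from collections import defaultdict

# Hypothetical toy scene with two agents (a0, a1) and one lane (l0).
# One snapshot per time step: typed, directed edges (src, relation, dst).
# The edge set differs between steps, which is what makes the graph dynamic.
SNAPSHOTS = [
    [("a1", "agent-agent", "a0"), ("l0", "lane-agent", "a0")],
    [("l0", "lane-agent", "a0"), ("l0", "lane-agent", "a1")],
]

# Illustrative per-relation scalar weights (assumed; a real model learns matrices).
REL_WEIGHT = {"agent-agent": 0.5, "lane-agent": 0.25}

# Per-step node features, e.g. 2-D positions (made-up values).
FEATURES = [
    {"a0": [0.0, 0.0], "a1": [1.0, 0.0], "l0": [0.0, 1.0]},
    {"a0": [0.1, 0.0], "a1": [1.0, 0.1], "l0": [0.0, 1.0]},
]


def hetero_aggregate(features, edges):
    """Mean-aggregate neighbor features per relation type, then sum over relations."""
    buckets = defaultdict(list)  # (dst, relation) -> neighbor feature vectors
    for src, rel, dst in edges:
        buckets[(dst, rel)].append(features[src])
    dim = len(next(iter(features.values())))
    out = {node: [0.0] * dim for node in features}
    for (dst, rel), neigh in buckets.items():
        for d in range(dim):
            mean_d = sum(vec[d] for vec in neigh) / len(neigh)
            out[dst][d] += REL_WEIGHT[rel] * mean_d
    return out


def recurrent_step(hidden, message, features, keep=0.5):
    """Simplified recurrent update: blend the previous state with a tanh candidate."""
    new_hidden = {}
    for node, h in hidden.items():
        cand = [math.tanh(x + m) for x, m in zip(features[node], message[node])]
        new_hidden[node] = [keep * hv + (1.0 - keep) * c for hv, c in zip(h, cand)]
    return new_hidden


def encode(features_per_step, snapshots):
    """Apply aggregation and the recurrent update over all snapshots in order."""
    dim = len(next(iter(features_per_step[0].values())))
    hidden = {node: [0.0] * dim for node in features_per_step[0]}
    for features, edges in zip(features_per_step, snapshots):
        message = hetero_aggregate(features, edges)
        hidden = recurrent_step(hidden, message, features)
    return hidden


states = encode(FEATURES, SNAPSHOTS)  # final per-node scenario representation
```

In this sketch the per-relation aggregation stands in for the heterogeneous graph convolution and the blended update stands in for the recurrent cell; a decoder that maps `states` to multi-modal trajectories is left out entirely.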