Pedestrian trajectory prediction is an essential component of intelligent systems, with applications that include, but are not limited to, autonomous driving, robot navigation, and anomaly detection in surveillance systems. Because motion behaviors are diverse and social interactions among pedestrians are complex, accurately forecasting future trajectories is challenging. Existing approaches commonly adopt GANs or CVAEs to generate diverse trajectories. However, GAN-based methods do not directly model data in a latent space, so they may lack full support over the underlying data distribution; CVAE-based methods optimize a lower bound on the log-likelihood of observations, so the learned distribution may deviate from the underlying distribution. These limitations often cause existing approaches to generate highly biased or inaccurate trajectories. In this paper, we propose STGlow, a novel generative flow-based framework with a dual graphormer for pedestrian trajectory prediction. Unlike previous approaches, our method models the underlying data distribution more precisely by optimizing the exact log-likelihood of motion behaviors. In addition, our method has a clear physical interpretation as a simulation of how human motion behaviors evolve: the forward process of the flow gradually degrades complex motion behavior into simple behavior, while the reverse process represents the evolution of simple behavior into complex motion behavior. Further, we introduce a dual graphormer, combined with a graph structure, to more adequately model temporal dependencies and mutual spatial interactions. Experimental results on several benchmarks demonstrate that our method achieves much better performance than previous state-of-the-art approaches.
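To make the phrase "exact log-likelihood" concrete, the following is a minimal sketch of the change-of-variables computation that flow-based models rely on, assuming a single affine coupling layer acting on a 2-D displacement and a standard normal base distribution. It is not the STGlow architecture: the function names and the parameters `w` and `b` are hypothetical and chosen only for illustration.

```python
import math

def coupling_forward(x, w, b):
    """Forward pass (data -> latent): keep x[0], transform x[1] conditioned on x[0]."""
    x0, x1 = x
    log_s = math.tanh(w * x0)   # bounded log-scale, a function of the untouched coordinate
    t = b * x0                  # shift, also a function of the untouched coordinate
    z1 = x1 * math.exp(log_s) + t
    # The Jacobian is triangular with diagonal (1, exp(log_s)), so log|det J| = log_s.
    return (x0, z1), log_s

def coupling_inverse(z, w, b):
    """Reverse pass (latent -> data): exactly undoes coupling_forward."""
    z0, z1 = z
    log_s = math.tanh(w * z0)
    t = b * z0
    return (z0, (z1 - t) * math.exp(-log_s))

def standard_normal_logpdf(z):
    """Log-density of an isotropic standard normal at point z."""
    return sum(-0.5 * (v * v + math.log(2.0 * math.pi)) for v in z)

def log_likelihood(x, w, b):
    """Exact log p(x) = log p_Z(f(x)) + log|det df/dx|, with no lower bound involved."""
    z, log_det = coupling_forward(x, w, b)
    return standard_normal_logpdf(z) + log_det

# Hypothetical example: one 2-D displacement and arbitrary parameters.
x = (0.3, -1.2)
z, log_det = coupling_forward(x, w=0.5, b=0.1)
x_rec = coupling_inverse(z, w=0.5, b=0.1)
print("latent:", z, "reconstructed:", x_rec, "log p(x):", log_likelihood(x, 0.5, 0.1))
```

In this sketch the forward pass plays the role of mapping complex behavior to a simple latent, and the inverse pass maps a simple latent back to the data space. Because the mapping is invertible with a tractable Jacobian, the log-likelihood is computed exactly rather than bounded.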