Accurate prediction of pedestrian trajectories is crucial for improving the safety of autonomous driving. The task is nontrivial, however, because human motion is inherently stochastic, so a predictor must generate multi-modal predictions. Previous works apply generative methods such as GANs and VAEs to pedestrian trajectory prediction, but these methods may suffer from mode collapse and relatively low-quality results. The denoising diffusion probabilistic model (DDPM) has recently been applied to trajectory prediction because of its simple training process and powerful reconstruction ability. However, current diffusion-based methods do not fully utilize input information, and they usually require either many denoising iterations, which leads to long inference times, or an additional network for initialization. To address these challenges and facilitate the use of diffusion models in multi-modal trajectory prediction, we propose GDTS, a novel Goal-Guided Diffusion Model with Tree Sampling. Because human motion is "goal-driven", GDTS uses goal estimation to guide generation by the diffusion network. We also present a two-stage tree sampling algorithm that exploits common features to reduce inference time and improve accuracy in multi-modal prediction. Experimental results on public datasets demonstrate that the proposed framework achieves performance comparable to the state of the art at real-time inference speed.
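To illustrate how a two-stage tree sampler could save denoising work, the sketch below runs a shared "trunk" of denoising steps once and then forks into one goal-conditioned "branch" per mode. This is only a toy under stated assumptions: the abstract does not specify the trunk/branch split, the step counts, or the update rule, so `toy_denoise_step`, `tree_sample`, and all parameter values are hypothetical, and the update is a simple pull toward a straight line to the goal rather than a trained DDPM reverse step.

```python
import random


def toy_denoise_step(traj, goal=None, rate=0.2):
    """Hypothetical stand-in for one reverse-diffusion step.

    Pulls each waypoint toward a straight line from the origin to `goal`
    (or toward the origin when no goal is given). A real model would call
    a trained noise-prediction network here instead.
    """
    n = len(traj)
    out = []
    for i, (x, y) in enumerate(traj):
        frac = i / (n - 1)
        if goal is None:
            tx, ty = 0.0, 0.0
        else:
            tx, ty = frac * goal[0], frac * goal[1]
        out.append((x + rate * (tx - x), y + rate * (ty - y)))
    return out


def tree_sample(goals, horizon=12, total_steps=10, trunk_steps=6, seed=0):
    """Assumed two-stage sampling: a shared trunk, then one branch per goal."""
    rng = random.Random(seed)
    # Start from Gaussian noise, one (x, y) waypoint per future time step.
    traj = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(horizon)]
    calls = 0

    # Stage 1 (trunk): denoise once without a goal; the result is shared.
    for _ in range(trunk_steps):
        traj = toy_denoise_step(traj)
        calls += 1

    # Stage 2 (branches): finish denoising separately for each goal estimate.
    branches = []
    for goal in goals:
        branch = traj
        for _ in range(total_steps - trunk_steps):
            branch = toy_denoise_step(branch, goal)
            calls += 1
        branches.append(branch)
    return branches, calls


if __name__ == "__main__":
    goals = [(1.0, 0.0), (0.0, 1.0), (-1.0, 1.0)]
    branches, calls = tree_sample(goals)
    independent = len(goals) * 10  # cost of sampling every mode from scratch
    print(f"denoiser calls: {calls} (tree) vs {independent} (independent)")
```

With three goals, ten total steps, and a six-step trunk, the tree needs 6 + 3 × 4 = 18 denoiser calls instead of 30 for independent sampling. The saving grows with the number of modes and with the fraction of steps assigned to the trunk.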