Pedestrian trajectory prediction is a key technology in many applications, providing insight into human behavior and anticipating future human motion. Most existing empirical models are formulated explicitly from observed human behaviors, using interpretable mathematical terms that are deterministic in nature, while recent work has focused on hybrid models that incorporate learning-based techniques to gain expressiveness while retaining explainability. However, the deterministic nature of the steering behaviors learned from the empirical models limits the practical performance of these hybrid models. To address this issue, this work proposes the social conditional variational autoencoder (SocialCVAE) for predicting pedestrian trajectories, which employs a CVAE to explore behavioral uncertainty in human motion decisions. SocialCVAE learns socially reasonable motion randomness by using a socially explainable interaction energy map as the CVAE's condition; this map represents the future occupancy of each pedestrian's local neighborhood area. The energy map is generated by an energy-based interaction model, which anticipates the energy cost (i.e., repulsion intensity) of a pedestrian's interactions with neighbors. Experimental results on two public benchmarks comprising 25 scenes demonstrate that SocialCVAE significantly improves prediction accuracy over state-of-the-art methods, with improvements of up to 16.85% in Average Displacement Error (ADE) and up to 69.18% in Final Displacement Error (FDE).
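For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how ADE and FDE are commonly computed for a single predicted trajectory. It assumes the standard definitions (mean and final-step Euclidean distance between predicted and ground-truth positions); the function name `displacement_errors` and the toy coordinates are illustrative only, and the abstract does not specify the exact evaluation protocol (for example, how multiple sampled trajectories are aggregated).

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points.

    Assumes the common definitions: ADE is the mean Euclidean distance over
    all predicted time steps, FDE is the distance at the final time step.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Per-step Euclidean distance between prediction and ground truth.
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    ade = sum(dists) / len(dists)
    fde = dists[-1]
    return ade, fde


# Toy example: the prediction moves straight along x while the
# ground truth drifts upward by one unit per step.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

Lower values are better for both metrics, so an "improvement" in ADE or FDE corresponds to a reduction in these distances.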