Understanding human motion is crucial for accurate pedestrian trajectory prediction. Conventional methods typically rely on supervised learning, in which predicted trajectories are optimized directly against ground-truth labels. Such direct supervision amplifies the limitations caused by long-tailed data distributions, making it difficult for the model to capture abnormal behaviors. In this work, we propose a self-supervised pedestrian trajectory prediction framework that explicitly models position, velocity, and acceleration. We leverage velocity and acceleration information to enhance position prediction through feature injection and a self-supervised motion consistency mechanism. Our model injects features hierarchically: velocity features are injected into the position stream, and acceleration features are injected into the velocity stream. This enables the model to predict position, velocity, and acceleration jointly. From the predicted positions, we compute the corresponding pseudo velocity and pseudo acceleration, allowing the model to learn from data-generated pseudo labels and thus achieve self-supervised learning. We further design a motion consistency evaluation strategy grounded in physical principles: it selects the most reasonable predicted motion trend by comparing the candidates with historical dynamics, and uses this trend to guide and constrain trajectory generation. Experiments on the ETH-UCY and Stanford Drone datasets show that our method achieves state-of-the-art performance on both.
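The pseudo-label step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that pseudo velocity and pseudo acceleration are obtained from predicted positions by first-order finite differences, that the sampling interval is 0.4 s, and that consistency is measured with a mean squared error. The function names `finite_differences` and `motion_consistency_loss` are hypothetical.

```python
import numpy as np


def finite_differences(positions, dt=0.4):
    """Derive pseudo velocity and pseudo acceleration from positions.

    positions: array of shape (T, 2), one (x, y) point per time step.
    dt: sampling interval in seconds (0.4 s is an assumption of this sketch).
    Returns velocity of shape (T-1, 2) and acceleration of shape (T-2, 2).
    """
    velocity = np.diff(positions, axis=0) / dt
    acceleration = np.diff(velocity, axis=0) / dt
    return velocity, acceleration


def motion_consistency_loss(pred_pos, pred_vel, pred_acc, dt=0.4):
    """Penalize disagreement between the predicted velocity/acceleration
    and the pseudo labels derived from the predicted positions.

    pred_pos: (T, 2), pred_vel: (T-1, 2), pred_acc: (T-2, 2).
    The MSE form is an illustrative choice, not taken from the paper.
    """
    pseudo_vel, pseudo_acc = finite_differences(pred_pos, dt)
    vel_term = np.mean((pred_vel - pseudo_vel) ** 2)
    acc_term = np.mean((pred_acc - pseudo_acc) ** 2)
    return float(vel_term + acc_term)
```

Under these assumptions the loss is zero when the three predicted streams are mutually consistent and grows as the predicted velocity or acceleration drifts away from what the predicted positions imply, so no ground-truth velocity or acceleration labels are needed for this term.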