Predicting pedestrian trajectories is crucial for the path planning and motion control of autonomous vehicles. Accurately forecasting crowd trajectories is challenging because human motion is uncertain and varies across environments. Recent deep learning-based prediction approaches train mainly on trajectory history and interactions between pedestrians, among other cues. Because they do not properly account for the discrepancies between training datasets, their prediction performance can degrade across scenarios. To overcome this limitation, this paper proposes a graph transformer structure that improves prediction performance by capturing the differences between the sites and scenarios contained in the datasets. In particular, a self-attention mechanism and a domain adaptation module are designed to improve the generalization ability of the model. Moreover, an additional metric that considers cross-dataset sequences is introduced for training and performance evaluation. The proposed framework is validated and compared against existing methods on the popular public datasets ETH and UCY. Experimental results demonstrate the improved performance of the proposed scheme.
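To make the self-attention idea concrete, the following is a minimal sketch of scaled dot-product self-attention restricted to the edges of a pedestrian interaction graph. It is an illustration only and not the architecture proposed in this paper: the function names `softmax` and `self_attention`, the use of raw features as queries, keys, and values (no learned projections), and the 0/1 adjacency matrix with self-loops are all assumptions made for the example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, adjacency):
    """Scaled dot-product self-attention masked by a graph.

    features:  list of N vectors, one per pedestrian, each of length d.
    adjacency: N x N matrix of 0/1; adjacency[i][j] == 1 lets pedestrian i
               attend to pedestrian j. Assumed to contain self-loops so that
               every row has at least one allowed entry.
    Queries, keys, and values are the raw features (an assumption; a real
    model would apply learned linear projections first).
    """
    d = len(features[0])
    out = []
    for i, query in enumerate(features):
        scores = []
        for j, key in enumerate(features):
            if adjacency[i][j]:
                dot = sum(a * b for a, b in zip(query, key))
                scores.append(dot / math.sqrt(d))
            else:
                # Masked pairs receive zero weight after the softmax.
                scores.append(float("-inf"))
        weights = softmax(scores)
        # Each output is a weighted average of the neighbours' features.
        out.append(
            [sum(w * v[c] for w, v in zip(weights, features)) for c in range(d)]
        )
    return out


# Hypothetical toy scene: pedestrians 0 and 1 interact, pedestrian 2 is isolated.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
attended = self_attention(features, adjacency)
```

In this toy scene the isolated pedestrian keeps its own features, while each of the two interacting pedestrians receives a mixture of both, weighted toward the more similar one.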