Accurate vehicle trajectory prediction is crucial for safe and efficient autonomous driving. This work explores the integration of a Transformer-based model with a Long Short-Term Memory (LSTM) based technique to enhance spatial and temporal feature learning in vehicle trajectory prediction. We propose a hybrid model that combines LSTMs for temporal encoding with a Transformer encoder for capturing complex interactions between vehicles. Spatial trajectory features of neighboring vehicles are processed through a masked scatter mechanism in a grid-based environment and then combined with the temporal trajectories of the vehicles. The combined trajectory data are learned through sequential LSTM encoding and Transformer-based attention layers. The proposed model is benchmarked against predecessor LSTM-based methods, including STA-LSTM, SA-LSTM, CS-LSTM, and NaiveLSTM. Although our results do not outperform these predecessors, they demonstrate the potential of integrating Transformers with LSTM-based techniques to build interpretable trajectory prediction models. Future work will explore alternative Transformer-based architectures to further enhance performance. This study provides a promising direction for improving trajectory prediction models by leveraging Transformer-based architectures, paving the way for more robust and interpretable vehicle trajectory prediction systems.
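The pipeline described above can be sketched in PyTorch as follows. This is a minimal, hypothetical illustration, not the authors' implementation: all dimensions (hidden size, grid shape, prediction horizon) are assumed for illustration, and the `HybridTrajectoryModel` class name, its arguments, and the (batch index, grid cell) neighbor layout are invented here. It shows the three stages the abstract names: LSTM encoding of each vehicle's history, a masked scatter of neighbor encodings onto a spatial grid, and a Transformer encoder attending over the fused tokens.

```python
import torch
import torch.nn as nn


class HybridTrajectoryModel(nn.Module):
    """Hypothetical sketch: LSTM temporal encoding -> masked scatter onto a
    spatial grid of neighbors -> Transformer attention -> trajectory decoding.
    All sizes are illustrative assumptions, not values from the paper."""

    def __init__(self, input_dim=2, hidden_dim=64, grid_rows=13, grid_cols=3,
                 n_heads=4, n_layers=2, horizon=25):
        super().__init__()
        self.grid_cells = grid_rows * grid_cols
        self.hidden_dim = hidden_dim
        self.horizon = horizon
        # Shared LSTM encodes the (x, y) history of ego and neighbor vehicles.
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        # Decode the fused ego token into a future (x, y) per horizon step.
        self.decoder = nn.Linear(hidden_dim, horizon * 2)

    def forward(self, ego_hist, nbr_hist, nbr_cells):
        # ego_hist: (B, T, 2); nbr_hist: (N, T, 2) for all neighbors in batch;
        # nbr_cells: (N, 2) rows of (batch_index, grid_cell_index),
        # assumed sorted by (batch_index, grid_cell_index).
        _, (h_ego, _) = self.lstm(ego_hist)   # (1, B, H)
        _, (h_nbr, _) = self.lstm(nbr_hist)   # (1, N, H)
        B = ego_hist.size(0)
        # Masked scatter: drop each neighbor encoding into its grid cell;
        # cells with no neighbor stay zero.
        grid = torch.zeros(B, self.grid_cells, self.hidden_dim)
        mask = torch.zeros(B, self.grid_cells, dtype=torch.bool)
        mask[nbr_cells[:, 0], nbr_cells[:, 1]] = True
        grid = grid.masked_scatter(mask.unsqueeze(-1), h_nbr.squeeze(0))
        # Prepend the ego encoding as an extra token and attend over everything.
        tokens = torch.cat([h_ego.permute(1, 0, 2), grid], dim=1)  # (B, 1+cells, H)
        fused = self.transformer(tokens)
        return self.decoder(fused[:, 0]).view(B, self.horizon, 2)
```

In this sketch the grid gives each neighbor a fixed spatial slot (as in CS-LSTM-style social pooling), and the Transformer replaces convolutional pooling as the interaction module, which is the substitution the abstract proposes.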