We introduce RedMotion, a transformer model for motion prediction in self-driving vehicles that learns environment representations via redundancy reduction. Our first type of redundancy reduction is induced by an internal transformer decoder and reduces a variable-sized set of local road-environment tokens, representing road graphs and agent data, to a fixed-size global embedding. The second type of redundancy reduction is obtained by self-supervised learning and applies the redundancy reduction principle to embeddings generated from augmented views of road environments. Our experiments reveal that our representation learning approach outperforms PreTraM, Traj-MAE, and GraphDINO in a semi-supervised setting. Moreover, RedMotion achieves results competitive with HPTR and MTR++ in the Waymo Motion Prediction Challenge. Our open-source implementation is available at: https://github.com/kit-mrt/future-motion