Trajectory prediction and planning are fundamental yet disconnected components in autonomous driving. Prediction models forecast the motion of surrounding agents under unknown intentions, producing multimodal distributions, while planning assumes known ego objectives and generates deterministic trajectories. This mismatch creates a critical bottleneck: prediction lacks supervision for agent intentions, while planning requires exactly this information. Existing prediction models, despite strong benchmark performance, often remain disconnected from planning constraints such as collision avoidance and dynamic feasibility. We introduce Plan TRansformer (PTR), a unified Gaussian Mixture Transformer framework integrating goal-conditioned prediction, dynamic feasibility, interaction awareness, and lane-level topology reasoning. A teacher-student training strategy progressively masks the commands of surrounding agents during training to align with inference conditions, where agent intentions are unavailable. PTR achieves a 4.3%/3.5% improvement in marginal/joint mAP over the baseline Motion Transformer (MTR) and a 15.5% reduction in planning error at the 5 s horizon compared to GameFormer. The architecture-agnostic design enables application to diverse Transformer-based prediction models. Project Website: https://github.com/SelzerConst/PlanTRansformer