Motion forecasting (MF) for autonomous driving aims to anticipate the trajectories of surrounding agents in complex urban scenarios. In this work, we investigate a mixed training strategy for MF that first pre-trains motion forecasters on pseudo-labeled data and then fine-tunes them on annotated data. To obtain pseudo-labeled trajectories, we propose a simple pipeline that leverages off-the-shelf single-frame 3D object detectors and non-learning trackers. We term the whole pre-training strategy, including pseudo-labeling, PPT. Our extensive experiments demonstrate that: (1) combining PPT with supervised fine-tuning on annotated data achieves superior performance on diverse testbeds, especially under annotation-efficient regimes; (2) scaling up to multiple datasets improves over the previous state of the art; and (3) PPT helps improve cross-dataset generalization. Our findings showcase PPT as a promising pre-training solution for robust motion forecasting in diverse autonomous driving contexts.
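To make the pseudo-labeling idea concrete, the following is a minimal sketch of how per-frame detections could be linked into trajectories without any learned component and then cut into (history, future) pairs for pre-training. It is an illustration under stated assumptions, not the pipeline used in this work: the abstract does not specify the tracker, so greedy nearest-neighbor association stands in for it, detections are reduced to 2D box centers, and the names `Track`, `link_detections`, `make_samples`, and the `max_dist`, `history`, and `future` parameters are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Track:
    """One pseudo-labeled trajectory: a list of (frame_idx, x, y) points."""
    track_id: int
    points: list = field(default_factory=list)


def link_detections(frames, max_dist=2.0):
    """Greedy nearest-neighbor association (assumed stand-in tracker).

    frames: list over time of lists of (x, y) detection centers.
    A track ends as soon as it finds no detection within max_dist.
    """
    tracks, active = [], []
    for t, dets in enumerate(frames):
        # Collect every (distance, track, detection) pair within the gate.
        pairs = []
        for i, tr in enumerate(active):
            _, px, py = tr.points[-1]
            for j, (x, y) in enumerate(dets):
                d = math.hypot(x - px, y - py)
                if d <= max_dist:
                    pairs.append((d, i, j))
        pairs.sort()  # closest pairs are matched first

        used_tracks, used_dets, next_active = set(), set(), []
        for _, i, j in pairs:
            if i in used_tracks or j in used_dets:
                continue
            used_tracks.add(i)
            used_dets.add(j)
            active[i].points.append((t, *dets[j]))
            next_active.append(active[i])

        # Unmatched detections start new tracks.
        for j, (x, y) in enumerate(dets):
            if j not in used_dets:
                tr = Track(track_id=len(tracks), points=[(t, x, y)])
                tracks.append(tr)
                next_active.append(tr)
        active = next_active
    return tracks


def make_samples(tracks, history=2, future=3):
    """Slice each track into (observed history, future target) windows."""
    samples = []
    for tr in tracks:
        xy = [(x, y) for _, x, y in tr.points]
        for s in range(len(xy) - history - future + 1):
            mid = s + history
            samples.append((xy[s:mid], xy[mid:mid + future]))
    return samples


# Toy input: two agents observed over five frames.
frames = [[(0, 0), (10, 0)], [(1, 0), (10, 1)], [(2, 0), (10, 2)],
          [(3, 0), (10, 3)], [(4, 0), (10, 4)]]
tracks = link_detections(frames)
samples = make_samples(tracks)
```

In this sketch the resulting `samples` would play the role of pseudo-labeled training pairs for the pre-training stage, with annotated trajectories reserved for the subsequent fine-tuning stage. A practical tracker would typically add motion prediction, 3D box overlap, and tolerance to missed detections, none of which are modeled here.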