Motion forecasting is a key module in an autonomous driving system. Due to the heterogeneous nature of multi-sourced input, the multimodality of agent behavior, and the low latency required for onboard deployment, this task is notoriously challenging. To cope with these difficulties, this paper proposes a novel agent-centric model with anchor-informed proposals for efficient multimodal motion prediction. We design a modality-agnostic strategy to concisely encode the complex input in a unified manner. We generate diverse proposals, fused with anchors bearing goal-oriented scene context, to induce multimodal predictions that cover a wide range of future trajectories. Our network architecture is highly uniform and succinct, leading to an efficient model amenable to real-world driving deployment. Experiments reveal that our agent-centric network compares favorably with state-of-the-art methods in prediction accuracy, while achieving inference latency on par with scene-centric methods.