Predicting the future behavior of agents is a fundamental task in autonomous driving. Accurate prediction relies on comprehending the surrounding map, which significantly regularizes agent behavior. However, existing methods exploit the map only to a limited extent and depend strongly on historical trajectories, which yields unsatisfactory prediction performance and robustness. Additionally, their heavy network architectures impede real-time applications. To tackle these problems, we propose the Map-Agent Coupled Transformer (MacFormer) for real-time and robust trajectory prediction. Our framework explicitly incorporates map constraints into the network via two carefully designed modules, the coupled map and the reference extractor. A novel multi-task optimization strategy (MTOS) is presented to enhance the learning of topology and rule constraints. We also devise a bilateral query scheme in context fusion for a more efficient and lightweight network. We evaluated our approach on the Argoverse 1, Argoverse 2, and nuScenes real-world benchmarks, where it achieved state-of-the-art performance on all three with the lowest inference latency and smallest model size. Experiments also demonstrate that our framework is resilient to imperfect tracklet inputs. Furthermore, we show that classical models outperform their baselines when combined with our proposed strategies, further validating the versatility of our framework.