Predicting the future behavior of agents is a fundamental task in autonomous driving. Accurate prediction relies on comprehending the surrounding map, which significantly regularizes agent behaviors. However, existing methods have limitations in exploiting the map and exhibit a strong dependence on historical trajectories, which yields unsatisfactory prediction performance and robustness. Additionally, their heavy network architectures impede real-time applications. To tackle these problems, we propose the Map-Agent Coupled Transformer (MacFormer) for real-time and robust trajectory prediction. Our framework explicitly incorporates map constraints into the network via two carefully designed modules, the coupled map and the reference extractor. A novel multi-task optimization strategy (MTOS) is presented to enhance learning of topology and rule constraints. We also devise a bilateral query scheme in context fusion for a more efficient and lightweight network. We evaluated our approach on the Argoverse 1, Argoverse 2, and nuScenes real-world benchmarks, where it achieved state-of-the-art performance on all three with the lowest inference latency and smallest model size. Experiments also demonstrate that our framework is resilient to imperfect tracklet inputs. Furthermore, we show that classical models combined with our proposed strategies outperform their baselines, further validating the versatility of our framework.