Estimating the joint distribution of on-road agents' future trajectories is essential for autonomous driving. In this technical report, we propose QCNeXt, a next-generation framework for joint multi-agent trajectory prediction. First, we adopt the query-centric encoding paradigm for the task of joint multi-agent trajectory prediction. With this encoding scheme, our scene encoder is permutation-equivariant over the set elements, roto-translation-invariant in the space dimension, and translation-invariant in the time dimension. These symmetry properties not only provide a principled basis for accurate multi-agent forecasting but also allow the encoder to process streaming inputs. Second, we propose a multi-agent DETR-like decoder, which facilitates joint multi-agent trajectory prediction by modeling agents' interactions at future time steps. For the first time, we show that a joint prediction model can outperform marginal prediction models even on the marginal metrics, which opens up new research opportunities in trajectory prediction. Our approach ranks first on the Argoverse 2 multi-agent motion forecasting benchmark and won the Argoverse Challenge at the CVPR 2023 Workshop on Autonomous Driving.
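The symmetry properties named in the abstract can be illustrated with a small numerical check. The sketch below is not the QCNeXt encoder; it only assumes that a query-centric scheme describes each pair of agents by features expressed in one agent's local frame. The function names (`pairwise_relative_features`, `rigid_transform`) and the particular feature choice are hypothetical and were picked for illustration. Under that assumption, rotating and translating the whole scene should leave the features unchanged, and reordering the agents should only reorder the feature tensor.

```python
import numpy as np

def pairwise_relative_features(pos, heading):
    """Hypothetical pairwise features in each agent's local frame.

    pos: (N, 2) global positions; heading: (N,) global headings in radians.
    Returns (N, N, 4): displacement of agent j in agent i's frame,
    plus cos/sin of the heading difference.
    """
    dx = pos[None, :, 0] - pos[:, None, 0]  # dx[i, j] = x_j - x_i
    dy = pos[None, :, 1] - pos[:, None, 1]
    c, s = np.cos(heading)[:, None], np.sin(heading)[:, None]
    # Rotate the global displacement by -heading_i into agent i's frame.
    local_x = c * dx + s * dy
    local_y = -s * dx + c * dy
    dh = heading[None, :] - heading[:, None]
    return np.stack([local_x, local_y, np.cos(dh), np.sin(dh)], axis=-1)

def rigid_transform(pos, heading, theta, shift):
    """Apply a global rotation by theta followed by a translation."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return pos @ rot.T + shift, heading + theta

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 2)) * 10.0
heading = rng.uniform(-np.pi, np.pi, size=5)
feats = pairwise_relative_features(pos, heading)

# Roto-translation invariance: moving the whole scene changes nothing.
pos_t, heading_t = rigid_transform(pos, heading, theta=1.3,
                                   shift=np.array([50.0, -20.0]))
feats_t = pairwise_relative_features(pos_t, heading_t)
print(np.allclose(feats, feats_t))

# Permutation equivariance: reordering agents reorders rows and columns.
perm = rng.permutation(5)
feats_p = pairwise_relative_features(pos[perm], heading[perm])
print(np.allclose(feats_p, feats[np.ix_(perm, perm)]))
```

Because the features never reference a global origin or orientation, a model that consumes only such features inherits the invariance by construction; the time-dimension counterpart would use relative time offsets in the same way.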