Multi-modal trajectory forecasting methods are commonly evaluated using single-agent metrics (marginal metrics), such as minimum Average Displacement Error (ADE) and Final Displacement Error (FDE), which fail to capture the joint performance of multiple interacting agents. Focusing only on marginal metrics can lead to unnatural predictions, such as colliding trajectories, or diverging trajectories for people who are clearly walking together as a group. Consequently, methods optimized for marginal metrics yield overly optimistic estimates of performance, which is detrimental to progress in trajectory forecasting research. In response to the limitations of marginal metrics, we present the first comprehensive evaluation of state-of-the-art (SOTA) trajectory forecasting methods with respect to multi-agent metrics (joint metrics): joint ADE (JADE), joint FDE (JFDE), and collision rate. We demonstrate the importance of joint metrics over marginal metrics with quantitative evidence and qualitative examples drawn from the ETH / UCY and Stanford Drone datasets. We introduce a new loss function incorporating joint metrics that, when applied to a SOTA trajectory forecasting method, achieves a 7\% improvement in JADE / JFDE on the ETH / UCY datasets with respect to the previous SOTA. Our results also indicate that optimizing for joint metrics naturally leads to improved interaction modeling, as evidenced by a 16\% decrease in mean collision rate on the ETH / UCY datasets with respect to the previous SOTA. Code is available at \texttt{\href{https://github.com/ericaweng/joint-metrics-matter}{github.com/ericaweng/joint-metrics-matter}}.
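The difference between marginal and joint metrics can be sketched as follows. The notation is assumed here for illustration and is not fixed by the abstract: $N$ agents in a scene, $T$ future timesteps, $K$ sampled futures, ground-truth position $x_{n,t}$, and predicted position $\hat{x}^{(k)}_{n,t}$ of agent $n$ at time $t$ in sample $k$. Under these assumptions, the marginal metric picks the best sample separately for each agent, while the joint metric picks a single best sample for the whole scene:

\begin{align}
\mathrm{ADE}_K &= \frac{1}{NT} \sum_{n=1}^{N} \min_{k \in \{1,\dots,K\}} \sum_{t=1}^{T} \bigl\lVert \hat{x}^{(k)}_{n,t} - x_{n,t} \bigr\rVert_2, \\
\mathrm{JADE}_K &= \min_{k \in \{1,\dots,K\}} \frac{1}{NT} \sum_{n=1}^{N} \sum_{t=1}^{T} \bigl\lVert \hat{x}^{(k)}_{n,t} - x_{n,t} \bigr\rVert_2 .
\end{align}

Because a minimum of sums is never smaller than the sum of per-term minima, $\mathrm{JADE}_K \ge \mathrm{ADE}_K$ always holds. The marginal score can therefore be reached by combining per-agent trajectories drawn from different samples, a combination that no single predicted scene contains, which is one way to read the claim that marginal metrics are overly optimistic. FDE and JFDE follow the same pattern, with the sum over $t$ replaced by the error at the final timestep $t = T$.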