Modeling spatial-temporal interactions among neighboring agents is at the heart of multi-agent problems such as motion forecasting and crowd navigation. Despite notable progress, it remains unclear to what extent modern representations can capture the causal relationships behind agent interactions. In this work, we take an in-depth look at the causal awareness of these representations, from computational formalism to real-world practice. First, we cast doubt on the notion of non-causal robustness studied in the recent CausalAgents benchmark. We show that recent representations are already partially resilient to perturbations of non-causal agents, yet modeling indirect causal effects involving mediator agents remains challenging. To address this challenge, we introduce a metric learning approach that regularizes latent representations with causal annotations. Our controlled experiments show that this approach not only leads to higher degrees of causal awareness but also yields stronger out-of-distribution robustness. To further operationalize it in practice, we propose a sim-to-real causal transfer method via cross-domain multi-task learning. Experiments on pedestrian datasets show that our method can substantially boost generalization, even in the absence of real-world causal annotations. We hope our work provides a new perspective on the challenges and potential pathways towards causally-aware representations of multi-agent interactions. Our code is available at https://github.com/socialcausality.
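To make the idea of regularizing latent representations with causal annotations more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each scene has a latent embedding, that counterfactual embeddings are obtained by removing one neighboring agent at a time, and that each removed agent carries a binary causal label. The function name `causal_margin_regularizer`, the hinge form, and the `margin` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def causal_margin_regularizer(z_factual, z_counterfactual, is_causal, margin=1.0):
    """Hypothetical hinge-style regularizer on latent distances.

    z_factual:        (d,) embedding of the original scene.
    z_counterfactual: (n, d) embeddings of scenes with one agent removed.
    is_causal:        (n,) booleans, True if the removed agent is annotated
                      as causally relevant to the ego agent.
    margin:           assumed minimum distance for causal removals.
    """
    z_factual = np.asarray(z_factual, dtype=float)
    z_counterfactual = np.asarray(z_counterfactual, dtype=float)
    is_causal = np.asarray(is_causal, dtype=bool)

    # Euclidean distance between the factual and each counterfactual embedding.
    dist = np.linalg.norm(z_counterfactual - z_factual, axis=1)

    # Removing a non-causal agent should barely move the embedding,
    # so any distance is penalized.
    pull = dist[~is_causal]

    # Removing a causal agent should move the embedding by at least `margin`,
    # so only distances below the margin are penalized.
    push = np.maximum(0.0, margin - dist[is_causal])

    terms = np.concatenate([pull, push])
    return float(terms.mean()) if terms.size else 0.0


# Illustrative usage with made-up 2-D embeddings.
z = [0.0, 0.0]
z_cf = [[0.0, 0.0], [3.0, 4.0]]
print(causal_margin_regularizer(z, z_cf, [False, True]))  # labels agree with distances
print(causal_margin_regularizer(z, z_cf, [True, False]))  # labels disagree with distances
```

In a training loop, a term of this kind would presumably be added to the forecasting loss with a weighting coefficient. The exact objective used in the paper may differ from this sketch.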