Graph neural networks are often used to model interacting dynamical systems since they scale gracefully to systems with a varying and large number of agents. While much progress has been made on deterministic interacting systems, modeling is much more challenging for stochastic systems, in which one is interested in obtaining a predictive distribution over future trajectories. Existing methods are either computationally slow, since they rely on Monte Carlo sampling, or make simplifying assumptions that restrict the predictive distribution to be unimodal. In this work, we present a deep state-space model that employs graph neural networks to model the underlying interacting dynamical system. The predictive distribution is multimodal and has the form of a Gaussian mixture model, where the moments of the Gaussian components can be computed via deterministic moment matching rules. Our moment matching scheme can be exploited for sample-free inference, leading to more efficient and stable training compared to Monte Carlo alternatives. Furthermore, we propose structured approximations to the covariance matrices of the Gaussian components in order to scale up to systems with many agents. We benchmark our novel framework on two challenging autonomous driving datasets. Results on both confirm the benefits of our method compared to state-of-the-art methods. We further demonstrate the usefulness of our individual contributions in a carefully designed ablation study and provide a detailed runtime analysis of our proposed covariance approximations. Finally, we empirically demonstrate the generalization ability of our method by evaluating its performance on unseen scenarios.
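To illustrate what deterministic moment matching means in general, the following is a minimal sketch that pushes the mean and variance of a Gaussian through one linear layer and one ReLU in closed form, without drawing samples. It is not the method of this work: the function names `linear_moments` and `relu_moments` are hypothetical, the covariance is assumed to be diagonal (inputs treated as independent), and neither the graph neural network layers nor the structured covariance approximations are reproduced. The ReLU formulas are the standard first and second moments of a rectified Gaussian.

```python
import math


def linear_moments(mean, var, weights, bias):
    """Moments of y = W x + b for x with independent Gaussian entries.

    Assumption: only the diagonal of the output covariance is kept,
    so correlations between outputs are discarded.
    """
    out_mean = [
        sum(w * m for w, m in zip(row, mean)) + b
        for row, b in zip(weights, bias)
    ]
    out_var = [sum(w * w * v for w, v in zip(row, var)) for row in weights]
    return out_mean, out_var


def relu_moments(mean, var):
    """Elementwise mean and variance of max(x, 0) for x ~ N(mean, var)."""
    out_mean, out_var = [], []
    for m, v in zip(mean, var):
        s = math.sqrt(v)
        if s == 0.0:
            # Degenerate case: the input is a point mass at m.
            out_mean.append(max(m, 0.0))
            out_var.append(0.0)
            continue
        z = m / s
        cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF
        first = m * cdf + s * pdf  # E[max(x, 0)]
        second = (m * m + v) * cdf + m * s * pdf  # E[max(x, 0)^2]
        out_mean.append(first)
        out_var.append(max(second - first * first, 0.0))  # guard against round-off
    return out_mean, out_var


# One hidden layer applied to a two-dimensional Gaussian input.
hidden_mean, hidden_var = linear_moments(
    [1.0, 2.0], [0.5, 0.25], [[2.0, -1.0], [0.5, 0.5]], [0.5, 0.0]
)
act_mean, act_var = relu_moments(hidden_mean, hidden_var)
```

In a mixture setting, a propagation of this kind would be applied to each Gaussian component separately, which is how a multimodal predictive distribution can be obtained without sampling; how the components are coupled in this work is not specified in the text above.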