Learning particle swarming models from data with Gaussian processes

Interacting particle or agent systems that display a rich variety of swarming behaviours are ubiquitous in science and engineering. A fundamental and challenging goal is to understand the link between individual interaction rules and swarming. In this paper, we study the data-driven discovery of a second-order particle swarming model that describes the evolution of $N$ particles in $\mathbb{R}^d$ under radial interactions. We propose a learning approach that models the latent radial interaction function as a Gaussian process and simultaneously fulfills two inference goals: the nonparametric inference of the interaction function with pointwise uncertainty quantification, and the inference of unknown scalar parameters in the non-collective friction forces of the system. We formulate the learning problem as a statistical inverse problem and provide a detailed analysis of recoverability conditions, establishing that a coercivity condition is sufficient for recoverability. Given data collected from $M$ i.i.d. trajectories with independent Gaussian observational noise, we provide a finite-sample analysis showing that our posterior mean estimator converges in a reproducing kernel Hilbert space norm, at an optimal rate in $M$ equal to that of classical one-dimensional kernel ridge regression. As a byproduct, we show that the posterior marginal variance attains a parametric learning rate in $M$ in the $L^{\infty}$ norm, and that this rate can also involve $N$ and $L$ (the number of observation time instances for each trajectory), depending on the condition number of the inverse problem. Numerical results on systems that exhibit different swarming behaviours demonstrate that our approach learns efficiently from scarce, noisy trajectory data.
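To make "posterior mean with pointwise uncertainty quantification" concrete, the following is a minimal sketch of standard one-dimensional Gaussian process regression, not the estimator analysed in the paper. It assumes noisy direct observations of a radial function at sampled distances, whereas the paper observes trajectories and must solve an inverse problem to reach the interaction function. The kernel choice, the function `phi_true`, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel on 1-D inputs (an illustrative choice)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(r_train, y_train, r_test, noise_std=0.1, length_scale=0.5):
    """Posterior mean and marginal variance of a zero-mean GP at r_test."""
    K = rbf_kernel(r_train, r_train, length_scale)
    K_s = rbf_kernel(r_test, r_train, length_scale)
    # Regularised Gram matrix; noise_std**2 plays the role of the ridge term.
    A = K + noise_std**2 * np.eye(len(r_train))
    chol = np.linalg.cholesky(A)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = K_s @ alpha  # coincides with the kernel ridge regression estimate
    v = np.linalg.solve(chol, K_s.T)
    # Prior variance k(r, r) equals 1 for this kernel.
    var = 1.0 - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Hypothetical radial function and synthetic noisy samples (assumptions).
rng = np.random.default_rng(0)
phi_true = lambda r: np.exp(-r) * np.cos(2.0 * r)
r_train = rng.uniform(0.0, 3.0, 40)
y_train = phi_true(r_train) + 0.1 * rng.standard_normal(40)

r_test = np.linspace(0.0, 3.0, 50)
mean, var = gp_posterior(r_train, y_train, r_test)
# Approximate pointwise 95% band for the latent function.
lower, upper = mean - 1.96 * np.sqrt(var), mean + 1.96 * np.sqrt(var)
```

The posterior mean here is exactly a kernel ridge regression estimate with ridge parameter `noise_std**2`, which is the connection the convergence rate above refers to; the marginal variance shrinks where distances are densely sampled and returns to the prior variance far from the data.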
