Network structures underlie the dynamics of many complex phenomena, from gene regulation and food webs to power grids and social media. Yet, because they often cannot be observed directly, their connectivity must be inferred from observations of the dynamics they give rise to. In this work we present a computational method that uses a neural network to infer large network adjacency matrices from time series data, and that provides uncertainty quantification on the prediction in a manner reflecting both the non-convexity of the inference problem and the noise on the data. Such uncertainty quantification is useful because network inference problems are typically underdetermined, and it has hitherto been lacking from such methods. We demonstrate our method's capabilities by inferring line failure locations in the British power grid from its response to a power cut. Because the problem is underdetermined, many classical statistical tools (e.g. regression) are not straightforwardly applicable. Our method, in contrast, provides probability densities on each edge, allowing the use of hypothesis testing to make meaningful probabilistic statements about the location of the power cut. We also demonstrate our method's ability to learn an entire cost matrix for a non-linear model of economic activity in Greater London. On noisy data, our method outperforms OLS regression in both speed and prediction accuracy, and scales as $N^2$, whereas OLS scales as $N^3$. Because it was not specifically engineered for network inference, our method is a general parameter estimation scheme applicable to any parameter dimension.