We investigate the problem of estimating the structure of a weighted network from repeated measurements of a Gaussian Graphical Model (GGM) on the network. Specifically, we consider GGMs whose covariance structures align with the geometry of the weighted network on which they are based. Such GGMs have been of longstanding interest in statistical physics, where they are referred to as the Gaussian Free Field (GFF). In recent years, they have attracted considerable interest in machine learning and theoretical computer science. In this work, we propose a novel estimator for the weighted network (equivalently, its Laplacian) from repeated measurements of a GFF on the network, based on the Fourier analytic properties of the Gaussian distribution. Our approach exploits complex-valued statistics constructed from the observed data, which are of interest in their own right. We demonstrate the effectiveness of our estimator with concrete recovery guarantees and bounds on the required sample complexity. In particular, we show that the proposed statistic achieves the parametric rate of estimation for fixed network size. In the setting of networks growing with sample size, we show that for Erdős-Rényi random graphs $G(d,p)$ above the connectivity threshold, network recovery takes place with high probability as soon as the sample size $n$ satisfies $n \gg d^4 \log d \cdot p^{-2}$.
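To illustrate the kind of complex-valued statistic that the Fourier analytic viewpoint suggests, the following is a minimal sketch, not the estimator proposed in this work. It relies only on the standard fact that a centered Gaussian vector $X$ with covariance $\Sigma$ has characteristic function $\mathbb{E}\, e^{i\langle t, X\rangle} = \exp(-\tfrac{1}{2} t^\top \Sigma t)$. The path graph, the edge weights, the mass parameter `mass` (added so that the precision matrix $L + m I$ is invertible), and all function names are assumptions made for this illustration.

```python
import numpy as np


def path_laplacian(weights):
    """Weighted Laplacian of a path graph with the given edge weights (toy example)."""
    d = len(weights) + 1
    L = np.zeros((d, d))
    for i, w in enumerate(weights):
        L[i, i] += w
        L[i + 1, i + 1] += w
        L[i, i + 1] -= w
        L[i + 1, i] -= w
    return L


def sample_massive_gff(L, mass, n, rng):
    """Draw n samples from N(0, (L + mass * I)^{-1}).

    The mass term is an assumption of this sketch: it makes the
    precision matrix strictly positive definite.
    """
    d = L.shape[0]
    cov = np.linalg.inv(L + mass * np.eye(d))
    cov = 0.5 * (cov + cov.T)  # remove tiny numerical asymmetry
    X = rng.multivariate_normal(np.zeros(d), cov, size=n)
    return X, cov


def empirical_cf(X, t):
    """Empirical characteristic function (1/n) * sum_k exp(i <t, X_k>)."""
    return np.mean(np.exp(1j * (X @ t)))


rng = np.random.default_rng(0)
L = path_laplacian([1.0, 2.0, 0.5])
X, cov = sample_massive_gff(L, mass=1.0, n=200_000, rng=rng)

t = np.array([0.3, -0.2, 0.5, 0.1])
phi_hat = empirical_cf(X, t)              # complex-valued statistic from data
phi_true = np.exp(-0.5 * t @ cov @ t)     # Gaussian characteristic function
print(abs(phi_hat - phi_true))
```

The empirical characteristic function concentrates around its Gaussian limit at the usual $n^{-1/2}$ rate, which is consistent with the parametric rate stated above for fixed network size; how such statistics are turned into an estimate of the Laplacian is not specified here.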