We consider a specific graph learning task: reconstructing a symmetric matrix that represents an underlying graph from linear measurements. We present a sparsity characterization for distributions of random graphs that may contain high-degree nodes, and use it to study fundamental trade-offs between the number of measurements, the complexity of the graph class, and the probability of error. We first derive a necessary condition on the number of measurements. Then, by considering a three-stage recovery scheme, we give a sufficient condition for recovery. Furthermore, assuming the measurements are i.i.d. Gaussian, we prove upper and lower bounds on the (worst-case) sample complexity for both noisy and noiseless recovery. In the special cases of the uniform distribution on trees with n nodes and the Erdős-Rényi (n,p) class, the fundamental trade-offs are tight up to multiplicative factors when the measurements are noiseless. In addition, for practical applications, we design and implement a polynomial-time (in n) heuristic algorithm based on the three-stage recovery scheme. Experiments show that this heuristic algorithm outperforms basis pursuit on star graphs. We apply it to learn admittance matrices in electric grids. Simulations for several canonical graph classes and IEEE power system test cases demonstrate the effectiveness and robustness of the proposed algorithm for parameter reconstruction.
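To make the measurement setting concrete, the following is a minimal sketch, not the implementation used in this work. It assumes the graph matrix Y is the unit-weight Laplacian of a star graph and that the measurements take the form A = BY with B an m-by-n i.i.d. standard Gaussian matrix. Both the model form and the helper names (`star_laplacian`, `gaussian_measurements`, `column_support_sizes`) are illustrative assumptions. The sketch shows why high-degree nodes matter: the hub column of a star graph is fully dense, while every leaf column has only two nonzero entries.

```python
import random

def star_laplacian(n):
    """Unit-weight Laplacian of a star graph on n nodes, node 0 being the hub.
    (Assumed stand-in for the symmetric graph matrix.)"""
    Y = [[0.0] * n for _ in range(n)]
    for leaf in range(1, n):
        Y[0][leaf] = Y[leaf][0] = -1.0  # edge between hub and leaf
        Y[leaf][leaf] = 1.0             # each leaf has degree 1
    Y[0][0] = float(n - 1)              # the hub has degree n - 1
    return Y

def gaussian_measurements(Y, m, rng):
    """Draw an m x n i.i.d. standard Gaussian matrix B and return (B, A) with A = B Y.
    (The product form A = B Y is an assumed measurement model.)"""
    n = len(Y)
    B = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    A = [[sum(B[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
         for i in range(m)]
    return B, A

def column_support_sizes(Y):
    """Number of nonzero entries in each column of Y."""
    n = len(Y)
    return [sum(1 for i in range(n) if Y[i][j] != 0.0) for j in range(n)]

n, m = 8, 4                      # fewer measurements than nodes
Y = star_laplacian(n)
B, A = gaussian_measurements(Y, m, random.Random(0))
sizes = column_support_sizes(Y)
print(sizes)                     # [8, 2, 2, 2, 2, 2, 2, 2]
```

Under these assumptions, a column-by-column sparse recovery method sees seven very sparse columns and one dense column, which is the kind of structure that a sparsity characterization permitting high-degree nodes is meant to cover.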