Network reconstruction is the task of inferring the unseen interactions between the elements of a system, based only on their behavior or dynamics. This inverse problem is in general ill-posed and admits many solutions for the same observation. Nevertheless, the vast majority of statistical methods proposed for this task (formulated as the inference of a graphical generative model) can only produce a ``point estimate,'' i.e., a single network considered the most likely. Such an estimate gives only a limited characterization of the reconstruction, since it cannot convey uncertainties or competing answers, even when these are structurally different but have comparable probabilities. In this work we present an efficient MCMC algorithm for sampling from posterior distributions of reconstructed networks, which is able to reveal the full population of answers for a given reconstruction problem, weighted according to their plausibilities. Our algorithm is general, since it does not rely on specific properties of particular generative models, and is especially suited for the inference of large and sparse networks, since in this case an iteration can be performed in time $O(N\log^2 N)$ for a network of $N$ nodes, instead of the $O(N^2)$ required by a more naive approach. We demonstrate the suitability of our method in providing uncertainties and consensus of solutions (which provably increases the reconstruction accuracy) in a variety of synthetic and empirical cases.
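To make the idea of a consensus over sampled networks concrete, the following minimal Python sketch estimates the marginal probability of each edge from a set of posterior samples and keeps the edges that appear in more than half of them. It is not the MCMC algorithm of this work: the samples are hard-coded, hypothetical data standing in for the output of a sampler, and the function names `edge_marginals` and `consensus_network` are assumptions made for illustration. The threshold of 1/2 is likewise an assumed, commonly used choice, which minimizes the expected number of misclassified edges under the sampled distribution.

```python
from collections import Counter


def edge_marginals(samples):
    """Estimate each edge's marginal posterior probability as the
    fraction of sampled networks that contain it."""
    counts = Counter()
    for edges in samples:
        # Sort endpoints so that undirected edges (1, 0) and (0, 1) coincide.
        counts.update({tuple(sorted(e)) for e in edges})
    return {e: c / len(samples) for e, c in counts.items()}


def consensus_network(marginals, threshold=0.5):
    """Keep the edges whose estimated marginal exceeds the threshold."""
    return {e for e, p in marginals.items() if p > threshold}


# Hypothetical posterior samples for a 4-node system (illustrative only).
samples = [
    {(0, 1), (1, 2)},
    {(0, 1), (2, 3)},
    {(0, 1), (1, 2)},
    {(1, 0), (1, 2), (2, 3)},
]

marginals = edge_marginals(samples)
consensus = consensus_network(marginals)
```

In this toy input, no single sample needs to coincide with the consensus network, and the marginals themselves quantify how uncertain each edge is, which is information a point estimate cannot carry.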