Exponential-family random graph models (ERGMs) are a family of network models that originated in social network analysis and have also been applied to biological networks. Advances in estimation algorithms have extended the practical scope of these models to larger networks. However, it is still not always possible to estimate a model without encountering near-degeneracy, particularly when only simple model parameters are to be used, rather than the more complex parameters designed to overcome near-degeneracy. Two network models related to the ERGM, the Tapered ERGM and the latent order logistic (LOLOG) model, have recently been proposed to overcome this problem. In this work I illustrate the application of the Tapered ERGM and the LOLOG to a set of biological networks, including protein-protein interaction (PPI) networks, gene regulatory networks, and neural networks. I find that the Tapered ERGM and the LOLOG can estimate models for networks for which a conventional ERGM could not be estimated, and can do so using only simple model parameters. For two neural networks where data on the spatial position of neurons is available, this allows the estimation of models that include terms for both spatial distance and triangle structures, so that the statistical significance of triangle motifs can be estimated while accounting for the effect of spatial proximity on connection probability. For some larger networks, however, Tapered ERGM and LOLOG estimation was not possible in practical time, while conventional ERGMs could be estimated only by using the Equilibrium Expectation (EE) algorithm.