In this paper, we propose a complexity measure for exchangeable graphs based on the graph-generating mechanism. Exchangeability for graphs means that the distribution is invariant under node permutations, which makes it a suitable default model for a wide range of graph data. For this well-studied class of graphs, we quantify complexity using graphon entropy. Graphon entropy is a graph property, meaning it is invariant under graph isomorphisms. We therefore focus on estimating the entropy of the mechanism that generated a graph realization, rather than selecting a specific graph feature. This approach lets us consider the global properties of a graph, capturing important graph-theoretic and topological characteristics such as sparsity, symmetry, and connectedness. We introduce a consistent graphon entropy estimator that achieves the nonparametric rate for any exchangeable graph with a smooth graphon representative. Additionally, we develop tailored entropy estimators for situations where more information about the underlying graphon is available, specifically for the widely studied Erdős-Rényi model, Configuration Model, and Stochastic Block Model. We determine their large-sample properties by providing a Central Limit Theorem for the first two models and a convergence rate for the third. We also conduct a simulation study to illustrate our theoretical findings and demonstrate the connection between graphon entropy and graph structure. Finally, we investigate the role of our entropy estimator as a complexity measure for characterizing real-world graphs.
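To make the idea of a tailored entropy estimator concrete, the following is a minimal sketch for the simplest case named above, the Erdős-Rényi model. It assumes the common definition of graphon entropy as the integral of the binary entropy of the graphon W over the unit square, so that for a constant graphon W ≡ p the entropy reduces to h(p) = -p log p - (1-p) log(1-p). The abstract does not state the paper's normalization or the exact form of its estimators, so the plug-in approach (replace p by the observed edge density) and the helper names `binary_entropy`, `sample_er_graph`, and `plugin_entropy_er` are illustrative assumptions, not the authors' method.

```python
import math
import random


def binary_entropy(p):
    """Binary entropy in nats, with the convention h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def sample_er_graph(n, p, seed=0):
    """Sample an undirected Erdos-Renyi graph G(n, p) as a list of edges (i, j), i < j."""
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


def plugin_entropy_er(n, edges):
    """Hypothetical plug-in estimator: binary entropy of the observed edge density."""
    n_pairs = n * (n - 1) // 2
    p_hat = len(edges) / n_pairs
    return binary_entropy(p_hat)


# Illustrative parameters (assumptions, not values from the paper).
n, p = 200, 0.3
edges = sample_er_graph(n, p, seed=1)
estimate = plugin_entropy_er(n, edges)
truth = binary_entropy(p)
print(f"estimated entropy: {estimate:.4f}, entropy at true p: {truth:.4f}")
```

Because the edge density averages over all n(n-1)/2 node pairs, the estimate concentrates quickly around h(p) as n grows, which is in line with the kind of large-sample behaviour the abstract describes for this model. The estimate depends only on the number of edges, so it is unchanged by any relabelling of the nodes.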