Model complexity is a key feature of any proposed data-generating mechanism. Measures of complexity can be extended to structured objects such as time signals and graphs. In this paper, we are concerned with the well-studied class of exchangeable graphs. Exchangeability for graphs means distributional invariance under permutation of the nodes, and it is a suitable default model that can be used widely for network data. For this class of graphs, we choose to quantify model complexity by the (Shannon) entropy, which results in the graphon entropy. We estimate the entropy of the generating mechanism of a given graph, instead of choosing a specific graph descriptor suitable for only one graph-generating mechanism. In this manner, we naturally consider the global properties of a graph and capture its important graph-theoretic and topological properties. Under an increasingly complex set of generating mechanisms, we propose a set of estimators of graphon entropy as measures of complexity for real-world graphs. We determine the large-sample properties of these estimators and discuss their use for characterizing evolving real-world graphs.
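
For orientation, the quantity being estimated can be written out. The abstract itself gives no formula, so what follows is the definition commonly used in the graph-limits literature, not necessarily the paper's exact normalization. The symbols $W$ (graphon), $h$ (binary entropy), $\pi_a$ (block proportions), and $\theta_{ab}$ (block connection probabilities) are notation assumed here for illustration.

```latex
% Binary (Shannon) entropy of a Bernoulli(p) edge indicator, with 0 \log 0 := 0.
% The base of the logarithm is a convention (natural log or base 2).
h(p) = -p \log p - (1 - p) \log (1 - p), \qquad p \in [0, 1].

% Graphon entropy of a symmetric measurable W : [0,1]^2 \to [0,1].
% Some authors include an extra factor of 1/2 to count each node pair once.
H(W) = \int_0^1 \int_0^1 h\bigl(W(x, y)\bigr) \, dx \, dy .

% Special case: a k-block stochastic block model, where W is piecewise constant
% with block proportions \pi_1, \dots, \pi_k and connection probabilities \theta_{ab}.
H(W) = \sum_{a=1}^{k} \sum_{b=1}^{k} \pi_a \, \pi_b \, h(\theta_{ab}) .
```

Under this reading, a constant graphon $W \equiv 1/2$ attains the maximum entropy, while any graphon taking only the values 0 and 1 has entropy zero. A plug-in estimator would replace $\pi_a$ and $\theta_{ab}$ by their estimates from the observed graph.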