A well-defined distance on the parameter space is key to evaluating estimators, ensuring consistency, and building confidence sets. While there are typically standard distances to adopt in a continuous space, this is not the case for combinatorial parameters such as graphs that represent statistical models. Existing proposals such as the structural Hamming distance are defined on the graphs alone; they ignore the structure of the model space and can therefore exhibit undesirable behavior. We propose a model-oriented framework for defining the distance between graphs that is applicable across different graph classes. Our approach treats each graph as a statistical model and organizes the graphs in a partially ordered set (poset) based on model inclusion. This induces a neighborhood structure, from which we define the model-oriented distance as the length of a shortest path through neighbors, yielding a metric on the space of graphs. We apply this framework to probabilistic undirected graphs, causal directed acyclic graphs, probabilistic completed partially directed acyclic graphs, and causal maximally oriented partially directed acyclic graphs. We analyze the theoretical and empirical behavior of the model-oriented distance and compare it with existing distances. By exploiting the underlying poset structures, we develop algorithms for computing and bounding the proposed distance that scale to moderate-sized graphs.
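The shortest-path definition can be illustrated with a minimal sketch. It assumes the neighborhood structure is supplied as a function, and it uses a hypothetical stand-in relation in which two undirected graphs are neighbors when they differ by exactly one edge. The names `shortest_path_distance` and `make_single_edge_neighbors` are illustrative, not taken from the paper, and the neighbor relation for other graph classes would have to be derived from the model-inclusion poset.

```python
from collections import deque
from itertools import combinations


def shortest_path_distance(start, goal, neighbors):
    """Breadth-first search: number of neighbor steps from start to goal."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        graph, dist = queue.popleft()
        for nxt in neighbors(graph):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # goal is not reachable through the given neighbor relation


def make_single_edge_neighbors(nodes):
    """Assumed toy relation: neighbors differ by adding or removing one edge."""
    all_edges = [frozenset(pair) for pair in combinations(nodes, 2)]

    def neighbors(graph):
        for edge in all_edges:
            # Symmetric difference toggles the presence of this one edge.
            yield graph ^ frozenset([edge])

    return neighbors


# Graphs are encoded as frozensets of undirected edges (hashable, so BFS can track them).
nodes = ["a", "b", "c"]
neighbors = make_single_edge_neighbors(nodes)
g1 = frozenset({frozenset({"a", "b"})})
g2 = frozenset({frozenset({"b", "c"}), frozenset({"a", "c"})})
d = shortest_path_distance(g1, g2, neighbors)
```

Because the search is breadth-first, the first time the goal is reached gives a shortest path. Exhaustive search of this kind is only feasible for very small graphs, which is presumably why dedicated algorithms and bounds are needed at larger sizes.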