Markov networks are probabilistic graphical models that use undirected graphs to represent conditional independence relationships among variables. We focus on constraint-based structure learning, in which the undirected graph is learned from data by performing conditional independence tests. We establish theoretical limits on two critical aspects of constraint-based learning of Markov networks: the number of tests and the sizes of the conditioning sets. These bounds uncover an interplay between the structural properties of the graph and the number of tests required to learn a Markov network. The starting point of our work is that the graph parameter maximum pairwise connectivity, $\kappa$, that is, the maximum number of vertex-disjoint paths connecting a pair of vertices in the graph, governs the sizes of the independence tests required to learn the graph. On the one hand, we show that at least one test whose conditioning set has size at least $\kappa$ is always necessary. On the other hand, we prove that any graph can be learned by performing tests of size at most $\kappa$. This completely resolves the question of the minimum size of conditioning sets required to learn the graph. As for the number of tests, our upper bound on the sizes of conditioning sets implies that every $n$-vertex graph can be learned by at most $n^{\kappa}$ tests with conditioning sets of sizes at most $\kappa$. We show that for any upper bound $q$ on the sizes of the conditioning sets, there exist graphs with $O(n q)$ vertices that require at least $n^{\Omega(\kappa)}$ tests to learn. This lower bound holds even when the treewidth and the maximum degree of the graph are at most $\kappa+2$. On the positive side, we prove that every graph of bounded treewidth can be learned by a polynomial number of tests with conditioning sets of sizes at most $2\kappa$.
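To make the parameter $\kappa$ concrete, the following is a minimal Python sketch of how it could be computed on a small graph. It is an illustration under stated assumptions, not the procedure used in this work: the graph is assumed to be given as a dictionary mapping each vertex to the set of its neighbors, the function names `disjoint_paths` and `max_pairwise_connectivity` are hypothetical, and a direct edge between the two endpoints is assumed not to count as a path (only paths with at least one internal vertex are counted, since only those can be blocked by a conditioning set). The count for each pair is obtained by a standard unit-capacity maximum flow with vertex splitting, which by Menger's theorem equals the number of internally vertex-disjoint paths.

```python
from collections import deque
from itertools import combinations


def disjoint_paths(adj, s, t):
    """Count internally vertex-disjoint s-t paths with at least one internal vertex.

    Assumption: a direct edge s-t, if present, is ignored.
    """
    cap = {}  # residual capacities, cap[a][b]

    def add_arc(a, b):
        cap.setdefault(a, {})[b] = 1
        cap.setdefault(b, {}).setdefault(a, 0)  # reverse residual arc

    # Split every vertex other than s and t into an "in" and an "out" copy
    # joined by a unit-capacity arc, so each vertex lies on at most one path.
    for v in adj:
        if v not in (s, t):
            add_arc((v, "in"), (v, "out"))
    # Each undirected edge u-v becomes arcs u_out -> v_in and v_out -> u_in.
    for u in adj:
        for v in adj[u]:
            if {u, v} == {s, t}:
                continue  # skip the direct edge (assumption, see above)
            add_arc((u, "out"), (v, "in"))

    source, sink = (s, "out"), (t, "in")
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            a = queue.popleft()
            for b, c in cap.get(a, {}).items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if sink not in parent:
            return flow
        # Push one unit of flow back along the path that was found.
        b = sink
        while parent[b] is not None:
            a = parent[b]
            cap[a][b] -= 1
            cap[b][a] += 1
            b = a
        flow += 1


def max_pairwise_connectivity(adj):
    """Maximum of disjoint_paths over all pairs of distinct vertices."""
    return max(
        (disjoint_paths(adj, s, t) for s, t in combinations(adj, 2)),
        default=0,
    )


# A 4-cycle 0-1-2-3-0: opposite vertices are joined by two disjoint paths.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(max_pairwise_connectivity(cycle4))
```

Under these assumptions, the 4-cycle has value 2: separating two opposite vertices requires conditioning on both of the remaining vertices, which matches the role that $\kappa$ plays as the required conditioning-set size. The sketch runs one maximum-flow computation per vertex pair and is meant only for small examples.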