In many statistical applications, the dimension is too large for standard high-dimensional machine learning procedures to handle. This is particularly true for graphical models, where interpreting a large graph is difficult and learning its structure is often computationally infeasible, either because the underlying graph is not sufficiently sparse or because the number of vertices is too large. To address this issue, we develop a procedure for testing a property of the graph underlying a graphical model that requires only a subquadratic number of correlation queries (i.e., the algorithm may access only a tiny fraction of the entries of the covariance matrix). This provides a conceptually simple test to determine whether the underlying graph is a tree or, more generally, whether it has a small separation number, a quantity closely related to the treewidth of the graph. The proposed method is a divide-and-conquer algorithm that can be applied to quite general graphical models.
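To make the query-access model concrete, the following minimal Python sketch wraps a correlation matrix in an oracle that records which entries are read, and checks one standard Gaussian identity: if a single vertex j separates i and k, then X_i and X_k are conditionally independent given X_j, so rho_ik = rho_ij * rho_jk. This is an illustration only, not the procedure proposed here; the names `CorrelationOracle`, `path_correlations`, and `separates` are hypothetical, and the toy model (a Gaussian path graph with exact, noise-free correlations) is an assumption made for the example.

```python
class CorrelationOracle:
    """Hypothetical wrapper: exposes single entries of a correlation
    matrix and records which distinct pairs have been queried."""

    def __init__(self, corr):
        self._corr = corr
        self.queried = set()

    def query(self, i, j):
        # Store the unordered pair so (i, j) and (j, i) count once.
        self.queried.add((min(i, j), max(i, j)))
        return self._corr[i][j]


def path_correlations(edge_corr):
    """Correlation matrix of a Gaussian model on the path 0-1-...-n.

    edge_corr[k] is the correlation between vertices k and k+1. On a
    tree, the correlation between two vertices is the product of the
    edge correlations along the path joining them.
    """
    n = len(edge_corr) + 1
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prod = 1.0
            for k in range(i, j):
                prod *= edge_corr[k]
            corr[i][j] = corr[j][i] = prod
    return corr


def separates(oracle, i, j, k, tol=1e-9):
    """Check rho_ik == rho_ij * rho_jk using three correlation queries.

    For Gaussian variables this identity is equivalent to conditional
    independence of X_i and X_k given X_j; reading it as graph
    separation additionally assumes the model is faithful to its graph.
    """
    lhs = oracle.query(i, k)
    rhs = oracle.query(i, j) * oracle.query(j, k)
    return abs(lhs - rhs) <= tol


# Toy usage on the path 0 - 1 - 2 - 3.
oracle = CorrelationOracle(path_correlations([0.5, 0.8, 0.9]))
vertex_1_separates_0_and_3 = separates(oracle, 0, 1, 3)
vertex_0_separates_1_and_2 = separates(oracle, 1, 0, 2)
distinct_queries = len(oracle.queried)
```

In this toy path, vertex 1 lies between 0 and 3, so the identity holds, whereas vertex 0 does not lie between 1 and 2, so it fails. The oracle's `queried` set is what a query-complexity bound would count; the example makes no attempt to be subquadratic.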