We develop optimal algorithms for learning undirected Gaussian trees and directed Gaussian polytrees from data. We consider two problems: distribution learning (i.e., learning in KL divergence) and structure learning (i.e., exact recovery). The first approach, based on the Chow-Liu algorithm, efficiently learns an optimal tree-structured distribution. The second approach modifies the PC algorithm for polytrees, using partial correlation as a conditional independence tester for constraint-based structure learning. We derive explicit finite-sample guarantees for both approaches and show that both are optimal by deriving matching lower bounds. Finally, we conduct numerical experiments comparing the performance of various algorithms, which provide further empirical evidence.
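To make the two ingredients concrete, the following is a minimal sketch, not the authors' implementation. It assumes zero-mean-style Gaussian samples stored row-wise in a NumPy array. The names `chow_liu_edges` and `partial_corr` are hypothetical and chosen for illustration only. The first function builds a maximum-weight spanning tree using the empirical Gaussian mutual information, which is the standard Chow-Liu construction; the second computes a sample partial correlation from the inverse covariance of the selected variables, the kind of statistic a constraint-based learner could threshold as a conditional independence test. The thresholds and the PC modification for polytrees are not shown.

```python
import numpy as np


def chow_liu_edges(X):
    """Edges of a maximum-weight spanning tree under empirical Gaussian
    mutual information (illustrative helper, not from the paper)."""
    d = X.shape[1]
    rho = np.corrcoef(X, rowvar=False)
    # For Gaussians, I(X_i; X_j) = -0.5 * log(1 - rho_ij^2).
    mi = -0.5 * np.log(np.clip(1.0 - rho ** 2, 1e-12, None))

    # Prim's algorithm, growing the tree from node 0.
    in_tree = np.zeros(d, dtype=bool)
    in_tree[0] = True
    best = mi[0].copy()              # best weight linking each node to the tree
    parent = np.zeros(d, dtype=int)  # tree node achieving that weight
    edges = []
    for _ in range(d - 1):
        candidates = np.where(in_tree, -np.inf, best)
        j = int(np.argmax(candidates))
        edges.append((int(parent[j]), j))
        in_tree[j] = True
        improve = (mi[j] > best) & ~in_tree
        best[improve] = mi[j][improve]
        parent[improve] = j
    return edges


def partial_corr(X, i, j, S):
    """Sample partial correlation of columns i and j given the columns in S
    (illustrative helper, not from the paper)."""
    idx = [i, j] + list(S)
    cov = np.cov(X[:, idx], rowvar=False)
    prec = np.linalg.inv(cov)
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


if __name__ == "__main__":
    # Assumed toy data: a chain X0 -> X1 -> X2 with edge weight 0.8.
    rng = np.random.default_rng(0)
    n = 2000
    x0 = rng.standard_normal(n)
    x1 = 0.8 * x0 + rng.standard_normal(n)
    x2 = 0.8 * x1 + rng.standard_normal(n)
    X = np.column_stack([x0, x1, x2])
    print(chow_liu_edges(X))
    print(partial_corr(X, 0, 2, [1]))
```

On chain-structured data such as the toy example, the partial correlation between the two end variables given the middle one should be close to zero, while their marginal correlation is not; this contrast is what a partial-correlation tester relies on.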