Bayesian network (BN) structure discovery algorithms typically either make assumptions about the sparsity of the true underlying network or are limited by computational constraints to networks with a small number of variables. While these sparsity assumptions can take various forms, they frequently focus on an upper bound for the maximum in-degree of the underlying graph, $\nabla_G$. Theorem 2 in Duttweiler et al. (2023) demonstrates that the largest eigenvalue of the normalized inverse covariance matrix ($\Omega$) of a linear BN is a lower bound for $\nabla_G$. Building on this result, this paper provides the asymptotic properties of, and a debiasing procedure for, the sample eigenvalues of $\Omega$, leading to a hypothesis test that may be used to determine whether the BN has maximum in-degree greater than 1. A linear BN structure discovery workflow is suggested in which the investigator uses this hypothesis test to aid in selecting an appropriate structure discovery algorithm. The performance of the hypothesis test is evaluated through simulations, and the workflow is demonstrated on data from a human psoriasis study.