Gaussian graphical models arise in a wide range of fields. They represent the statistical relationships between variables as a graph in which an edge between two variables indicates conditional dependence. Unfortunately, well-established estimators, such as the graphical lasso or neighborhood selection, are known to be susceptible to a high prevalence of false edge detections. False detections may encourage inaccurate or even incorrect scientific interpretations, with major implications in applications such as biomedicine or healthcare. In this paper, we introduce a nodewise variable selection approach to graph learning and provably control the false discovery rate of the selected edge set at a self-estimated level. A novel method for fusing the individual neighborhoods then outputs an undirected graph estimate. The proposed method is parameter-free and requires no tuning by the user. In numerical experiments across different graph topologies, benchmarks against competing false discovery rate controlling methods show a significant gain in performance.
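To make the two central objects of the abstract concrete, the following minimal sketch fuses nodewise neighborhoods into an undirected edge set and computes the false discovery proportion (FDP) of that set, whose expectation is the false discovery rate. It does not reproduce the proposed fusion method, which the abstract does not specify. As a stand-in, it assumes the classical AND and OR rules known from neighborhood selection. The function names `fuse_neighborhoods` and `false_discovery_proportion`, as well as the toy neighborhoods and true edge set, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def fuse_neighborhoods(neighborhoods, rule="and"):
    """Fuse per-node neighborhoods into an undirected edge set.

    neighborhoods: dict mapping each node to the set of nodes it selected.
    rule: "and" keeps edge (i, j) only if i selected j AND j selected i;
          "or" keeps it if at least one of the two nodes selected the other.
    NOTE: these are the classical rules, used here as an assumption;
    they are not the fusion method proposed in the paper.
    """
    if rule not in ("and", "or"):
        raise ValueError("rule must be 'and' or 'or'")
    edges = set()
    for i, j in combinations(sorted(neighborhoods), 2):
        i_picks_j = j in neighborhoods[i]
        j_picks_i = i in neighborhoods[j]
        if rule == "and":
            keep = i_picks_j and j_picks_i
        else:
            keep = i_picks_j or j_picks_i
        if keep:
            edges.add((i, j))  # stored with i < j, so the edge is undirected
    return edges


def false_discovery_proportion(selected_edges, true_edges):
    """Fraction of selected edges that are not true edges (0 if none selected)."""
    if not selected_edges:
        return 0.0
    return len(selected_edges - true_edges) / len(selected_edges)


# Hypothetical toy example: node 0 falsely selects node 2.
neighborhoods = {0: {1, 2}, 1: {0}, 2: {3}, 3: {2}}
true_edges = {(0, 1), (2, 3)}

and_edges = fuse_neighborhoods(neighborhoods, rule="and")
or_edges = fuse_neighborhoods(neighborhoods, rule="or")

fdp_and = false_discovery_proportion(and_edges, true_edges)
fdp_or = false_discovery_proportion(or_edges, true_edges)
```

In this toy case the AND rule discards the one-sided false selection, whereas the OR rule retains it as a false edge. This shows why the choice of fusion step affects the false discovery rate of the final graph estimate.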