The graphical lasso is a widely used algorithm for fitting undirected Gaussian graphical models. However, for inference on functionals of edge values in the learned graph, standard tools lack formal statistical guarantees, such as control of the type I error rate. In this paper, we introduce a selective inference method for asymptotically valid inference after graphical lasso selection with added randomization. We obtain a selective likelihood, conditional on the event of selection, through a change of variable on the known density of the randomization variables. Our method enables interval estimation and hypothesis testing for a wide range of functionals of edge values in the learned graph using the conditional maximum likelihood estimate. Our numerical studies show that introducing a small amount of randomization: (i) greatly increases power and yields substantially shorter intervals compared to other conditional inference methods, including data splitting; (ii) ensures intervals of bounded length in high-dimensional settings where data splitting is infeasible due to insufficient samples for inference; (iii) enables inference for a wide range of inferential targets in the learned graph, including measures of node influence and connectivity between nodes.
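To make the selection step concrete, here is a minimal sketch of a graphical lasso fit with added randomization. The abstract does not specify how the randomization enters, so this sketch assumes a small symmetric Gaussian perturbation `W` subtracted from the sample covariance, which is the same as adding a linear term in the precision matrix to the objective. The solver is a generic textbook ADMM routine, not the authors' implementation, and the names `graphical_lasso_admm`, `selected_edges`, `lam`, `rho`, and the noise scale `0.05` are hypothetical choices for illustration only.

```python
import numpy as np


def soft_threshold(a, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)


def graphical_lasso_admm(S, lam, rho=1.0, n_iter=500):
    """Generic ADMM for: min -logdet(Theta) + tr(S Theta) + lam * ||offdiag(Theta)||_1.

    S only needs to be symmetric, so a perturbed covariance is acceptable.
    Returns the sparse ADMM iterate Z (exact zeros off the selected edges).
    """
    p = S.shape[0]
    Z = np.eye(p)
    U = np.zeros((p, p))
    off_diag = ~np.eye(p, dtype=bool)
    for _ in range(n_iter):
        # Theta-update: closed form via an eigendecomposition.
        evals, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta = (evals + np.sqrt(evals ** 2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta) @ Q.T
        # Z-update: soft-threshold the off-diagonal entries only.
        A = Theta + U
        Z = A.copy()
        Z[off_diag] = soft_threshold(A[off_diag], lam / rho)
        # Scaled dual update.
        U = A - Z
    return Z


def selected_edges(Z, tol=1e-8):
    """Edges (i, j), i < j, whose estimated precision entry is nonzero."""
    p = Z.shape[0]
    return {(i, j) for i in range(p) for j in range(i + 1, p) if abs(Z[i, j]) > tol}


rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.standard_normal((n, p))
X[:, 1] += 0.8 * X[:, 0]  # induce a strong dependence between nodes 0 and 1
S = X.T @ X / n           # sample covariance (mean assumed zero)

# Assumed randomization: a small symmetric Gaussian matrix with known density.
G = rng.standard_normal((p, p))
W = 0.05 * (G + G.T) / 2.0

Z_hat = graphical_lasso_admm(S - W, lam=0.1)
edges = selected_edges(Z_hat)
print(sorted(edges))
```

The sketch covers only the randomized selection of the graph. The inferential step described in the abstract, which conditions on the selected edge set and uses the known density of `W`, is not implemented here.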