Stability, like reproducibility, is crucial in statistical analysis. This paper examines the stability of sparse network inference in high-dimensional graphical models, where the selected edges should remain consistent across samples. We focus on the Graphical Lasso and its decomposition into two steps, the first of which is a hierarchical clustering with single linkage. We prove that single linkage is stable by controlling the distance between the two dendrograms inferred from two samples. Experiments on simulated and real datasets further illustrate the stability of each step of the Graphical Lasso: the dendrograms, the variable clusters, and the final networks. Together, the theoretical and empirical results show that single linkage is more stable than other methods when a modular structure is present.
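The first step described above can be illustrated with a minimal sketch; it is not the paper's implementation. It assumes the dissimilarity between two variables is 1 - |empirical correlation|, that the distance between two dendrograms is the largest absolute difference between their cophenetic distances, and that cutting the single linkage tree at height 1 - lam yields the variable clusters on which the second step would run. The function names, the simulated block structure, and the threshold 0.4 are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import squareform


def simulate(n, rng, n_blocks=3, block_size=5, rho=0.7):
    """Draw n Gaussian observations with a block-diagonal (modular) covariance."""
    p = n_blocks * block_size
    sigma = np.kron(np.eye(n_blocks), np.full((block_size, block_size), rho))
    np.fill_diagonal(sigma, 1.0)
    return rng.multivariate_normal(np.zeros(p), sigma, size=n)


def single_linkage_tree(X):
    """Single linkage dendrogram of the variables (columns) of X.

    Assumption: dissimilarity = 1 - |empirical correlation|.
    Returns the SciPy linkage matrix and the condensed dissimilarities.
    """
    corr = np.corrcoef(X, rowvar=False)
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)
    dist = (dist + dist.T) / 2.0          # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    return linkage(condensed, method="single"), condensed


def blocks_at(Z, lam):
    """Variable clusters from cutting the tree at height 1 - lam, i.e. the
    connected components of the graph with an edge where |corr| >= lam."""
    return fcluster(Z, t=1.0 - lam, criterion="distance")


def dendrogram_sup_distance(Z1, Z2):
    """Largest absolute difference between the two cophenetic distance vectors."""
    return float(np.max(np.abs(cophenet(Z1) - cophenet(Z2))))


# Two independent samples drawn from the same modular model.
rng = np.random.default_rng(0)
X1, X2 = simulate(500, rng), simulate(500, rng)

Z1, d1 = single_linkage_tree(X1)
Z2, d2 = single_linkage_tree(X2)

tree_gap = dendrogram_sup_distance(Z1, Z2)    # distance between dendrograms
input_gap = float(np.max(np.abs(d1 - d2)))    # distance between dissimilarities

labels1 = blocks_at(Z1, lam=0.4)
labels2 = blocks_at(Z2, lam=0.4)

print(f"dendrogram gap = {tree_gap:.4f}, dissimilarity gap = {input_gap:.4f}")
print("clusters, sample 1:", labels1)
print("clusters, sample 2:", labels2)
```

Under this particular choice of distance, the single linkage cophenetic distances cannot move by more than the largest change in the input dissimilarities, so `tree_gap` should never exceed `input_gap`. Whether this matches the distance and the bound used in the paper is not stated in the abstract, so the sketch should be read only as one plausible way to probe dendrogram and cluster stability across two samples.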