Gaussian graphical models (GGMs) are widely used to estimate network structure in applications ranging from biology to finance. In practice, data are often corrupted by latent confounders, which bias inference of the true underlying graphical structure. In this paper, we compare and contrast two strategies for inference in graphical models with latent confounders: Gaussian graphical models with latent variables (LVGGM) and PCA-based removal of confounding (PCA+GGM). While these two approaches have similar goals, they are motivated by different assumptions about confounding. We explore the connection between the two approaches and propose a new method that combines their strengths. We prove consistency and derive the convergence rate for the PCA-based method, and use these results to provide guidance about when to use each method. We demonstrate the effectiveness of our methodology in simulations and in two real-world applications.
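To make the PCA+GGM idea concrete, the following is a minimal sketch of one plausible reading of it: project the top principal components out of the data, then estimate a graph from the residuals. It is not the procedure analyzed in the paper. The function names (`remove_top_pcs`, `partial_correlations`, `pca_ggm`) and the parameters `k`, `ridge`, and `threshold` are hypothetical, and the ridge-regularized inverse with thresholded partial correlations is a simple stand-in for a sparse precision estimator such as the graphical lasso.

```python
import numpy as np


def remove_top_pcs(X, k):
    """Center the columns of X (n x p) and project out its top-k principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T  # p x k matrix of the leading directions
    return Xc - Xc @ V_k @ V_k.T


def partial_correlations(X, ridge=1e-2):
    """Partial correlations from a ridge-regularized inverse covariance.

    Assumption: the ridge term stands in for a sparse estimator; it is needed
    here because the residual covariance is rank-deficient after PC removal.
    """
    n, p = X.shape
    S = X.T @ X / n
    Theta = np.linalg.inv(S + ridge * np.eye(p))
    Theta = (Theta + Theta.T) / 2.0  # enforce exact symmetry
    d = np.sqrt(np.diag(Theta))
    P = -Theta / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P


def pca_ggm(X, k, ridge=1e-2, threshold=0.1):
    """Boolean adjacency matrix estimated after removing k principal components."""
    residual = remove_top_pcs(X, k)
    P = partial_correlations(residual, ridge=ridge)
    A = np.abs(P) > threshold  # illustrative hard threshold, not a tuned choice
    np.fill_diagonal(A, False)
    return A


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Six observed variables sharing one latent confounder (synthetic data).
    confounder = rng.normal(size=(200, 1))
    X = rng.normal(size=(200, 6)) + 2.0 * confounder
    print(pca_ggm(X, k=1).astype(int))
```

In this sketch the number of removed components `k` plays the role of the assumed number of latent confounders; how to choose it, and when PCA-based removal is preferable to LVGGM, is the question the paper's theoretical results address.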