Latent variables are common in practical problems, for example when some variables are difficult or expensive to measure, or simply unknown. When latent variables are unaccounted for, structure learning for Gaussian graphical models can be obscured by additional correlation between the observed variables induced by the latent variables. A standard approach to this problem is a latent version of the graphical lasso, which splits the inverse covariance matrix into a sparse part and a low-rank part that are penalized separately. In this paper we propose a generalization of this approach via the flexible Golazo penalty. This allows us to introduce latent versions of, for example, the adaptive lasso, positive dependence constraints, or predetermined sparsity patterns, as well as combinations of these. We develop an algorithm for the latent Gaussian graphical model with the Golazo penalty and demonstrate it on simulated and real data.
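For concreteness, the standard latent-variable graphical lasso mentioned above can be written as the following convex program (a sketch in my own notation: $\hat\Sigma$ denotes the sample covariance of the observed variables, $S$ the sparse part, $L$ the low-rank part, and $\lambda,\gamma$ separate penalty weights):

```latex
\min_{S,\,L}\;\; -\log\det(S - L) \;+\; \operatorname{tr}\!\bigl(\hat\Sigma\,(S - L)\bigr)
\;+\; \lambda\,\|S\|_{1} \;+\; \gamma\,\operatorname{tr}(L)
\quad \text{s.t. } S - L \succ 0,\; L \succeq 0.
```

Here $K = S - L$ plays the role of the observed-variable precision matrix: the $\ell_1$ norm promotes sparsity in $S$, while the trace (nuclear-norm) penalty promotes low rank in $L$, reflecting a small number of latent variables. The generalization proposed in the paper replaces the single $\ell_1$ penalty on $S$ with a more flexible penalty family.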