Generalized linear mixed models (GLMMs) are a widely used tool in statistical analysis. The main bottleneck of many computational approaches lies in the inversion of the high-dimensional precision matrices associated with the random effects. Such matrices are typically sparse; however, their sparsity pattern resembles a multipartite random graph, which does not lend itself well to default sparse linear algebra techniques. Notably, we show that, for typical GLMMs, the Cholesky factor is dense even when the original precision matrix is sparse. We thus turn to approximate iterative techniques, in particular to the conjugate gradient (CG) method. Combining a detailed analysis of the spectrum of these precision matrices with results from random graph theory, we show that CG-based methods applied to high-dimensional GLMMs typically achieve a fixed approximation error with a total cost that scales linearly with the number of parameters and observations. Numerical illustrations with both real and simulated data confirm the theoretical findings, while also illustrating situations, such as nested structures, where CG-based methods struggle.
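As an illustration of the kind of iterative solver the abstract refers to, the sketch below implements the plain CG iteration for a symmetric positive definite system, which avoids any explicit factorization (and hence the fill-in that makes the Cholesky factor dense). The function name, tolerance, and test matrix are illustrative choices, not taken from the paper.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A using CG.

    Each iteration only needs a matrix-vector product with A, so the
    per-iteration cost is proportional to the number of nonzeros of A.
    """
    n = len(b)
    max_iter = max_iter or n
    x = np.zeros(n)
    r = b - A @ x          # initial residual
    p = r.copy()           # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)      # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:  # residual small enough: stop
            break
        p = r + (rs_new / rs) * p  # new A-conjugate direction
        rs = rs_new
    return x

# Small SPD example standing in for a sparse precision matrix
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x = conjugate_gradient(A, b)
```

In exact arithmetic CG converges in at most n iterations; the paper's point is that for typical GLMM precision matrices the spectrum is clustered, so a fixed accuracy is reached in a number of iterations that does not grow with the problem size.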