Over the last two decades, the linear model of coregionalization (LMC) has been widely used to model multivariate spatial processes. However, likelihood-based inference for such models can be challenging because of the cubic cost of Gaussian likelihood evaluations. Starting from an analogy with matrix normal models, we propose a reformulation of the LMC likelihood that highlights its linear, rather than cubic, computational complexity as a function of the dimension of the response vector. We describe how these simplifications can be exploited in Gaussian hierarchical models. In addition, we propose a new sparsity-inducing approach to the LMC that introduces structural zeros in the coregionalization matrix, with the aim of reducing the number of parameters in a principled, data-driven way. Our reformulation of the LMC likelihood ensures that the sparse approach comes at virtually no additional cost when included in a Markov chain Monte Carlo (MCMC) algorithm. On synthetic data, the sparse approach is shown to significantly improve predictive performance. We also apply our methodology to a dataset comprising air pollutant measurements from the state of California, and we use the sparse method to provide new insights into the strength of the correlation among the measurements.
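To make the complexity claim concrete, the following is a minimal sketch, not the authors' implementation. It assumes a noise-free LMC of the form Y = W Aᵀ, where Y is the n × p response matrix, A is a square, invertible p × p coregionalization matrix, and the columns of W are independent zero-mean Gaussian processes. Under those assumptions the likelihood factorizes into p separate n × n Gaussian evaluations (cost linear in p), instead of one np × np evaluation (cost cubic in p). The function names, the exponential covariance, and the range values are all hypothetical choices made for illustration.

```python
import numpy as np

def exp_kernel(coords, range_param, jitter=1e-8):
    # Exponential covariance exp(-d / range); an illustrative choice only.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-dist / range_param) + jitter * np.eye(len(coords))

def gaussian_loglik(y, cov):
    # Zero-mean multivariate normal log-density via a Cholesky factor.
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, y)
    return (-0.5 * z @ z
            - np.log(np.diag(chol)).sum()
            - 0.5 * len(y) * np.log(2.0 * np.pi))

def lmc_loglik_factorized(Y, A, kernels):
    # Recover the latent processes W = Y A^{-T}, then sum p independent
    # n x n Gaussian log-densities and add the change-of-variables term.
    n, p = Y.shape
    W = np.linalg.solve(A, Y.T).T
    _, logabsdet = np.linalg.slogdet(A)
    latent = sum(gaussian_loglik(W[:, j], kernels[j]) for j in range(p))
    return latent - n * logabsdet

def lmc_loglik_dense(Y, A, kernels):
    # Reference computation on the full np x np covariance of vec(Y):
    # sum_j (a_j a_j^T) kron K_j, with a_j the j-th column of A.
    p = Y.shape[1]
    cov = sum(np.kron(np.outer(A[:, j], A[:, j]), kernels[j])
              for j in range(p))
    return gaussian_loglik(Y.T.reshape(-1), cov)

# Small synthetic check that the two evaluations agree.
rng = np.random.default_rng(0)
n, p = 25, 3
coords = rng.uniform(size=(n, 2))
kernels = [exp_kernel(coords, r) for r in (0.2, 0.5, 1.0)]
A = np.eye(p) + 0.3 * rng.normal(size=(p, p))
W = np.column_stack([np.linalg.cholesky(K) @ rng.normal(size=n)
                     for K in kernels])
Y = W @ A.T
print(lmc_loglik_factorized(Y, A, kernels), lmc_loglik_dense(Y, A, kernels))
```

The dense version is included only as a correctness reference. In this simplified setting, a structural zero in A would simply be an entry fixed at zero, and the factorized evaluation would be unchanged.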