The Linear Model of Co-regionalization (LMC) is a very general model of multitask Gaussian processes for regression or classification. While its expressivity and conceptual simplicity are appealing, naive implementations have cubic complexity in the number of datapoints and the number of tasks, making approximations mandatory for most applications. However, recent work has shown that under some conditions the latent processes of the model can be decoupled, leading to a complexity that is only linear in the number of said processes. Here we extend these results, showing from the most general assumptions that the only condition necessary for efficient exact computation of the LMC is a mild hypothesis on the noise model. We introduce a full parametrization of the resulting \emph{projected LMC} model, and an expression of the marginal likelihood enabling efficient optimization. We perform a parametric study on synthetic data to show the excellent performance of our approach, compared to an unrestricted exact LMC and approximations of the latter. Overall, the projected LMC appears to be a credible and simpler alternative to state-of-the-art models, one that greatly facilitates some computations such as leave-one-out cross-validation and fantasization.
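To make the cubic cost of the naive approach concrete, the following is a minimal sketch of an unrestricted exact LMC marginal likelihood, not the projected LMC introduced in this work. It rests on several assumptions that the abstract does not specify: squared-exponential latent kernels, one-dimensional inputs shared by all tasks, and independent per-task Gaussian noise. All function and parameter names (`rbf_kernel`, `naive_lmc_covariance`, `naive_lmc_log_marginal_likelihood`, `H`, `lengthscales`, `noise_var`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def rbf_kernel(x, lengthscale):
    """Squared-exponential kernel matrix on 1-D inputs (an assumed kernel choice)."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def naive_lmc_covariance(x, H, lengthscales, noise_var):
    """Dense (n*p) x (n*p) covariance of an LMC with p tasks and q latent processes.

    H is the (p, q) mixing matrix; latent process j has its own lengthscale.
    Outputs are stacked task-major: index = task * n + point.
    """
    n = x.shape[0]
    p, q = H.shape
    K = np.zeros((n * p, n * p))
    for j in range(q):
        B_j = np.outer(H[:, j], H[:, j])  # rank-one coregionalization matrix
        K += np.kron(B_j, rbf_kernel(x, lengthscales[j]))
    # Assumed noise model: independent Gaussian noise with one variance per task.
    K += np.kron(np.diag(noise_var), np.eye(n))
    return K


def naive_lmc_log_marginal_likelihood(x, Y, H, lengthscales, noise_var):
    """Exact log marginal likelihood; Y has shape (p, n).

    The Cholesky factorization of the (n*p) x (n*p) matrix costs O((n*p)^3),
    which is the cubic scaling in datapoints and tasks mentioned above.
    """
    y = Y.reshape(-1)  # task-major flattening, consistent with np.kron(B, K_x)
    K = naive_lmc_covariance(x, H, lengthscales, noise_var)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, y)  # L^{-1} y, so alpha @ alpha = y^T K^{-1} y
    log_det_half = np.log(np.diag(L)).sum()  # equals 0.5 * log|K|
    return -0.5 * alpha @ alpha - log_det_half - 0.5 * y.size * np.log(2.0 * np.pi)
```

With `n` datapoints and `p` tasks, the matrix being factorized has `n * p` rows, so doubling either quantity multiplies the dominant cost by roughly eight. This dense factorization is the step that decoupling the latent processes is meant to avoid.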