The Linear Model of Co-regionalization (LMC) is a very general multitask Gaussian process model for regression or classification. While its expressivity and conceptual simplicity are appealing, naive implementations have cubic complexity in both the number of datapoints and the number of tasks, making approximations mandatory for most applications. However, recent work has shown that under some conditions the latent processes of the model can be decoupled, leading to a complexity that is only linear in the number of these processes. Here we extend these results, showing from the most general assumptions that the only condition necessary for efficient exact computation of the LMC is a mild hypothesis on the noise model. We introduce a full parametrization of the resulting \emph{projected LMC} model, and an expression of the marginal likelihood that enables efficient optimization. We perform a parametric study on synthetic data to show the excellent performance of our approach compared to an unrestricted exact LMC and to approximations of the latter. Overall, the projected LMC appears to be a credible and simpler alternative to state-of-the-art models, and it greatly facilitates some computations such as leave-one-out cross-validation and fantasization.
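To make the cubic cost of a naive implementation concrete, the following minimal sketch builds the full LMC covariance for $n$ datapoints and $p$ tasks and evaluates the exact log marginal likelihood with a dense Cholesky factorization. It illustrates only the unrestricted baseline, not the projected LMC itself. The function names, the RBF latent kernels, the diagonal noise model, and the task-major stacking of observations are assumptions made here for illustration and are not taken from the text.

```python
import numpy as np

def rbf_kernel(x, lengthscale):
    # Squared-exponential kernel on 1-D inputs (an illustrative choice).
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def lmc_covariance(x, H, lengthscales, noise_var):
    # H is the (p x q) mixing matrix: p tasks, q independent latent processes.
    # Observations are assumed stacked task by task (task-major order).
    n = x.shape[0]
    p, q = H.shape
    K = np.zeros((n * p, n * p))
    for j in range(q):
        B_j = np.outer(H[:, j], H[:, j])  # rank-one coregionalization matrix
        K += np.kron(B_j, rbf_kernel(x, lengthscales[j]))
    # Assumed noise model: independent noise with one variance per task.
    K += np.kron(np.diag(noise_var), np.eye(n))
    return K

def naive_log_marginal_likelihood(y, K):
    # Dense Cholesky of an (n*p x n*p) matrix: O(n^3 p^3) time.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, y)  # alpha = L^{-1} y, so alpha @ alpha = y^T K^{-1} y
    log_det_half = np.log(np.diag(L)).sum()  # equals 0.5 * log|K|
    return -0.5 * alpha @ alpha - log_det_half - 0.5 * y.size * np.log(2 * np.pi)
```

Because the covariance has size $np \times np$, doubling either the number of datapoints or the number of tasks multiplies the factorization cost by roughly eight, which is the scaling that decoupling the latent processes is meant to avoid.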