A Gaussian process is proposed as a model for the posterior distribution of the local predictive ability of a model or expert, conditional on a vector of covariates, from historical predictions in the form of log predictive scores. Assuming Gaussian expert predictions and a Gaussian data generating process, a linear transformation of the predictive score follows a noncentral chi-squared distribution with one degree of freedom. Motivated by this, we develop a noncentral chi-squared Gaussian process regression to flexibly model local predictive ability, with the posterior distribution of the latent GP function and kernel hyperparameters sampled by Hamiltonian Monte Carlo. We show that a cube-root transformation of the log scores is approximately Gaussian with homoscedastic variance, which makes it possible to estimate the model much faster by marginalizing the latent GP function analytically. Linear pools based on learned local predictive ability are applied to predict daily bike usage in Washington, DC.
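The distributional claim can be checked with a small simulation. The sketch below is illustrative only and is not the authors' code. It assumes a data generating process y ~ N(mu, sigma^2) and an expert predictive density N(mu_e, sigma_e^2), with hypothetical parameter values. Under these assumptions the log score is a linear function of ((y - mu_e) / sigma)^2, which is noncentral chi-squared with one degree of freedom and noncentrality (mu - mu_e)^2 / sigma^2. The abstract does not state exactly which quantity is cube-rooted, so applying the cube root to the transformed score z is a further assumption made here.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
mu, sigma = 1.0, 2.0       # data generating process: y ~ N(mu, sigma^2)
mu_e, sigma_e = 2.0, 1.5   # expert predictive density: N(mu_e, sigma_e^2)

rng = np.random.default_rng(0)
y = rng.normal(mu, sigma, size=200_000)

# Log predictive score of the expert at each realised outcome.
const = 0.5 * np.log(2 * np.pi * sigma_e**2)
log_score = -const - (y - mu_e) ** 2 / (2 * sigma_e**2)

# Linear transformation of the log score; algebraically ((y - mu_e) / sigma)^2.
z = -(2 * sigma_e**2 / sigma**2) * (log_score + const)

# Noncentrality parameter of the implied chi-squared distribution (1 d.o.f.).
lam = (mu - mu_e) ** 2 / sigma**2


def skewness(x):
    """Sample skewness: third central moment over variance^(3/2)."""
    c = x - x.mean()
    return (c**3).mean() / (c**2).mean() ** 1.5


# A noncentral chi-squared with k = 1 has mean 1 + lam and variance 2 * (1 + 2 * lam).
print("mean:     sample", z.mean(), " theory", 1 + lam)
print("variance: sample", z.var(), " theory", 2 * (1 + 2 * lam))

# Assumption: the cube root is applied to z. Compare skewness before and after.
print("skewness of z:         ", skewness(z))
print("skewness of cbrt(z):   ", skewness(np.cbrt(z)))
```

Matching the first two moments is a sanity check rather than a proof. The skewness comparison only indicates how much closer to symmetric the cube-root scale is for this one parameter setting.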