We present a computationally efficient strategy to initialise the hyperparameters of a Gaussian process (GP) that avoids computing the likelihood function. Our strategy can be used as a pretraining stage to find initial conditions for maximum-likelihood (ML) training, or as a standalone method to compute hyperparameter values to be plugged directly into the GP model. Motivated by the fact that training a GP via ML is equivalent (on average) to minimising the KL-divergence between the true and learnt model, we set out to explore different metrics/divergences among GPs that are computationally inexpensive and provide hyperparameter values close to those found via ML. In practice, we identify the GP hyperparameters by projecting the empirical covariance or (Fourier) power spectrum onto a parametric family, thus proposing and studying various measures of discrepancy operating in the temporal and frequency domains. Our contribution extends the variogram method developed in the geostatistics literature and, accordingly, is referred to as the generalised variogram method (GVM). In addition to the theoretical presentation of GVM, we provide experimental validation in terms of accuracy, consistency with ML and computational complexity for different kernels using synthetic and real-world data.
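To make the idea of projecting an empirical covariance onto a parametric family concrete, the following is a minimal sketch and not the authors' implementation. It assumes an evenly sampled series, a squared-exponential (SE) kernel as the parametric family, and a plain squared-error discrepancy in the temporal domain; the function names (`se_kernel`, `empirical_covariance`, `fit_se_to_covariance`) and the grid search over the lengthscale are illustrative choices that do not come from the source.

```python
import math


def se_kernel(lag, variance, lengthscale):
    """Squared-exponential covariance evaluated at a given lag."""
    return variance * math.exp(-0.5 * (lag / lengthscale) ** 2)


def empirical_covariance(y, max_lag):
    """Biased sample autocovariance of an evenly sampled series.

    Returns values for integer lags 0..max_lag (in units of the sampling step).
    """
    n = len(y)
    mean = sum(y) / n
    centred = [v - mean for v in y]
    return [
        sum(centred[i] * centred[i + h] for i in range(n - h)) / n
        for h in range(max_lag + 1)
    ]


def fit_se_to_covariance(lags, emp_cov, lengthscale_grid):
    """Least-squares fit of an SE kernel to covariance values (assumed setup).

    The kernel is linear in its variance, so for each candidate lengthscale
    the best variance has a closed form; the lengthscale itself is chosen by
    grid search. No likelihood is evaluated at any point.
    """
    best = None
    for ell in lengthscale_grid:
        shape = [se_kernel(h, 1.0, ell) for h in lags]
        variance = sum(c * s for c, s in zip(emp_cov, shape)) / sum(
            s * s for s in shape
        )
        variance = max(variance, 0.0)  # a variance cannot be negative
        loss = sum((c - variance * s) ** 2 for c, s in zip(emp_cov, shape))
        if best is None or loss < best[0]:
            best = (loss, variance, ell)
    return best[1], best[2]


# Usage: recover known parameters from noise-free covariance values.
lags = [0.1 * h for h in range(50)]
target = [se_kernel(h, 2.0, 0.7) for h in lags]
grid = [0.05 * k for k in range(1, 61)]  # candidate lengthscales 0.05 .. 3.0
variance, lengthscale = fit_se_to_covariance(lags, target, grid)
```

With real data, the output of `empirical_covariance` would replace `target`, with the integer lags multiplied by the sampling step. The fitted values could then serve either as initial conditions for ML training or as final hyperparameters, in the two ways the abstract describes.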