We present a new strategy for learning the functional relation between a pair of variables, while addressing inhomogeneities in the correlation structure of the available data, by modelling the sought function as a sample function of a non-stationary Gaussian Process (GP) that nests multiple other GPs within itself; we prove that each of these nested GPs can be stationary, thereby establishing the sufficiency of two GP layers. Specifically, a non-stationary kernel is envisaged, with each hyperparameter dependent on the sample function drawn from the outer non-stationary GP, so that a new sample function is drawn at every pair of input values at which the kernel is computed. Such a model cannot be implemented directly, however, so we substitute for it by recalling that the average effect of drawing different sample functions from a given GP is equivalent to that of drawing one sample function from each of a set of GPs rendered mutually distinct, with these GPs updated during the equilibrium stage of the undertaken inference (via MCMC). The kernel is fully non-parametric, and it suffices to learn one hyperparameter per GP layer, for each dimension of the input variable. We illustrate this new learning strategy on a real dataset.
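To make the nesting concrete, the following is a minimal numerical sketch of the general idea, not the authors' exact construction: an outer stationary GP is sampled to obtain an input-dependent (log) length-scale, which then parametrises an inner non-stationary kernel. Here the inner kernel is taken to be the well-known Gibbs kernel, which admits an input-dependent length-scale while remaining positive semi-definite; the specific kernel forms, length-scales, and grid are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(x1, x2, ls=1.0):
    # Stationary squared-exponential kernel, used here for the outer GP.
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

# Inputs at which the sought function is modelled (illustrative grid).
x = np.linspace(0.0, 1.0, 50)

# Outer GP layer: draw one sample function for the log length-scale,
# so the inner kernel's hyperparameter varies with the input.
K_outer = rbf(x, x, ls=0.3) + 1e-8 * np.eye(x.size)
ls = np.exp(rng.multivariate_normal(np.zeros(x.size), K_outer))

# Inner non-stationary (Gibbs) kernel built from the sampled length-scales:
# k(x, x') = sqrt(2 l(x) l(x') / (l(x)^2 + l(x')^2))
#            * exp(-(x - x')^2 / (l(x)^2 + l(x')^2))
l2 = ls[:, None] ** 2 + ls[None, :] ** 2
K_inner = (np.sqrt(2.0 * ls[:, None] * ls[None, :] / l2)
           * np.exp(-((x[:, None] - x[None, :]) ** 2) / l2))

# One sample function of the resulting nested, non-stationary GP.
f = rng.multivariate_normal(np.zeros(x.size), K_inner + 1e-8 * np.eye(x.size))
```

In a full treatment, drawing many such length-scale functions (and hence many inner kernels) would be replaced, as the abstract describes, by updating a set of distinct GPs within an MCMC scheme; the sketch above shows only a single draw through the two layers.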