In this paper, we propose novel Gaussian process-gated hierarchical mixtures of experts (GPHMEs), in which both the gates and the experts are built with Gaussian processes. Unlike in other mixtures of experts, where the gating models are linear in the input, the gating functions of our model are inner nodes built with Gaussian processes based on random features, and are non-linear and non-parametric. Further, the experts are also built with Gaussian processes and provide predictions that depend on the test data. The GPHMEs are optimized by variational inference. The proposed GPHMEs have several advantages. First, they outperform tree-based HME benchmarks that partition the data in the input space. Second, they achieve good performance with reduced complexity. Third, they provide interpretability of deep Gaussian processes and, more generally, of deep Bayesian neural networks. Our GPHMEs demonstrate excellent performance on large-scale data sets even when the models are of quite modest size.
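To make the architecture described in the abstract more concrete, the following is a minimal sketch of a depth-one version of such a model: a single gate and two experts, all driven by the same random-feature approximation of a Gaussian process. It is an illustration under stated assumptions, not the authors' implementation. The trigonometric (random Fourier) feature map, the logistic gate, the function names, and all sizes are assumptions, and the weights are simply sampled from a standard normal prior rather than learned by variational inference.

```python
import numpy as np


def random_fourier_features(X, Omega):
    """Trigonometric random features that approximate an RBF-kernel GP.

    X:     (n, d) inputs
    Omega: (d, m) spectral frequencies (assumed drawn from a Gaussian)
    Returns an (n, 2m) feature matrix.
    """
    proj = X @ Omega
    m = Omega.shape[1]
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(m)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gphme_depth1_predict(X, Omega, w_gate, W_experts):
    """Soft prediction of a one-gate, two-expert hierarchical mixture.

    The gate is a non-linear function of the input because it acts on the
    random features, not on X directly. Each expert is also a linear model
    in the same features, so its output depends on the test input.
    """
    Phi = random_fourier_features(X, Omega)   # (n, 2m)
    p_left = sigmoid(Phi @ w_gate)            # (n,) probability of the left branch
    expert_out = Phi @ W_experts              # (n, 2) one column per expert
    y = p_left * expert_out[:, 0] + (1.0 - p_left) * expert_out[:, 1]
    return y, p_left, expert_out


# Hypothetical sizes and prior draws, for illustration only.
rng = np.random.default_rng(0)
n, d, m = 5, 3, 50
lengthscale = 1.0
X = rng.normal(size=(n, d))
Omega = rng.normal(scale=1.0 / lengthscale, size=(d, m))
w_gate = rng.normal(size=2 * m)
W_experts = rng.normal(size=(2 * m, 2))

y, p_left, expert_out = gphme_depth1_predict(X, Omega, w_gate, W_experts)
```

In a deeper tree, each inner node would hold its own gate weights, and the probability of reaching a leaf would be the product of the gate probabilities along the path from the root to that leaf.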