In this paper, we propose novel Gaussian process-gated hierarchical mixtures of experts (GPHMEs). Unlike other mixtures of experts, whose gating models are linear in the input, our model employs gating functions built with Gaussian processes (GPs). These processes are based on random features that are non-linear functions of the inputs. Furthermore, the experts in our model are also constructed with GPs. The GPHMEs are optimized by variational inference. The proposed GPHMEs have several advantages. They outperform tree-based HME benchmarks that partition the data in the input space, and they achieve good performance with reduced complexity. Another advantage is the interpretability they provide for deep GPs and, more generally, for deep Bayesian neural networks. Our GPHMEs demonstrate excellent performance on large-scale data sets, even when the models themselves are of quite modest size.
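To make the gating idea concrete, the following is a minimal sketch of a forward pass through a GP-gated hierarchical mixture, not the implementation used in the paper. It assumes trigonometric random features for an RBF kernel with unit lengthscale, sigmoid gates at the inner nodes of a full binary tree, and experts that are linear in the same random features. All names (`random_features`, `leaf_probabilities`, `predict`, `Omega`, `W_gate`, `W_expert`) are hypothetical, and the weights are drawn from a standard normal prior rather than learned by variational inference.

```python
import numpy as np

def random_features(X, Omega):
    """Trigonometric random features (assumed RBF-kernel approximation)."""
    proj = X @ Omega                                  # (n, m)
    m = Omega.shape[1]
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(m)  # (n, 2m)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def leaf_probabilities(Phi, W_gate, depth):
    """Path probabilities for a full binary tree with 2**depth leaves.

    W_gate has one column per inner node, stored in breadth-first order.
    Each gate is non-linear in the input because it acts on Phi, not on X.
    """
    n = Phi.shape[0]
    go_left = sigmoid(Phi @ W_gate)                   # (n, 2**depth - 1)
    probs = np.ones((n, 1))                           # probability of reaching the root
    node = 0
    for level in range(depth):
        width = 2 ** level
        g = go_left[:, node:node + width]             # gates of this level
        # Split each node's probability mass between its left and right child.
        probs = np.stack([probs * g, probs * (1.0 - g)], axis=2).reshape(n, 2 * width)
        node += width
    return probs

def predict(X, Omega, W_gate, W_expert, depth):
    """Mixture prediction: leaf probabilities weight the expert outputs."""
    Phi = random_features(X, Omega)
    pi = leaf_probabilities(Phi, W_gate, depth)       # (n, n_leaves)
    expert_out = Phi @ W_expert                       # (n, n_leaves)
    return np.sum(pi * expert_out, axis=1), pi

# Toy usage with prior samples in place of learned variational parameters.
rng = np.random.default_rng(0)
n, d, m, depth = 5, 3, 16, 2
X = rng.normal(size=(n, d))
Omega = rng.normal(size=(d, m))                       # assumed unit lengthscale
W_gate = rng.normal(size=(2 * m, 2 ** depth - 1))
W_expert = rng.normal(size=(2 * m, 2 ** depth))
y_hat, pi = predict(X, Omega, W_gate, W_expert, depth)
```

In this sketch the soft gates give every input a probability distribution over the leaves, so each row of `pi` sums to one. Inspecting `pi` shows which experts are responsible for which inputs, which is one way to read the interpretability claim above.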