Mixtures of experts have become an indispensable tool for flexible modelling in supervised learning, allowing not only the mean function but the entire density of the output to change with the inputs. Sparse Gaussian processes (GPs) have shown promise as experts in such models, and in this article we propose using a deep neural network (DNN) as the gating network that selects among the experts in a mixture of sparse GPs. Furthermore, a fast one-pass algorithm called Cluster-Classify-Regress (CCR) is leveraged to approximate the maximum a posteriori (MAP) estimator. Together, this combination of model and algorithm delivers a novel method that is flexible, robust, and extremely efficient. In particular, the method is able to outperform competing methods in terms of accuracy and uncertainty quantification. Its cost is competitive on small, low-dimensional data sets and significantly lower on large, higher-dimensional data sets. Iteratively maximizing the distribution of experts given allocations, and of allocations given experts, does not provide significant improvement, which indicates that the one-pass algorithm already achieves a good approximation to the local MAP estimator. This insight may also be useful for other mixture-of-experts models.
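To make the cluster, classify, regress pipeline concrete, the following is a minimal NumPy sketch of one pass on toy one-dimensional data. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not detail the individual steps, so the sketch assumes what the name suggests: cluster the joint input-output points, train a classifier to predict the cluster from the inputs, and fit one regressor per cluster. Plain k-means stands in for the clustering step, a softmax regression stands in for the DNN gating network, and an exact GP with fixed hyperparameters stands in for each sparse GP expert. All function names, hyperparameter values, and the toy data are hypothetical.

```python
import numpy as np

def kmeans(Z, k, iters=50, seed=0):
    """Cluster step: plain k-means on the joint (input, output) points Z."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(0)
    return labels

def fit_gate(X, labels, k, steps=2000, lr=0.5):
    """Classify step: softmax regression from inputs to cluster labels.
    Assumption: a simple stand-in for the DNN gating network."""
    A = np.hstack([np.ones((len(X), 1)), X])      # prepend a bias column
    W = np.zeros((A.shape[1], k))
    Y = np.eye(k)[labels]                         # one-hot targets
    for _ in range(steps):
        logits = A @ W
        P = np.exp(logits - logits.max(1, keepdims=True))
        P /= P.sum(1, keepdims=True)
        W -= lr * A.T @ (P - Y) / len(X)          # gradient of the cross-entropy
    return W

def gate(W, X):
    """Hard allocation: index of the most probable expert for each input."""
    A = np.hstack([np.ones((len(X), 1)), X])
    return (A @ W).argmax(1)

def rbf(A, B, length=0.5):
    """Squared-exponential kernel with unit signal variance."""
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d / length**2)

def fit_gp(X, y, noise=0.1):
    """Regress step: exact GP with fixed hyperparameters.
    Assumption: a stand-in for a sparse GP expert."""
    mu = y.mean()                                 # constant mean per expert
    L = np.linalg.cholesky(rbf(X, X) + noise**2 * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - mu))
    return X, mu, L, alpha

def predict_gp(model, Xs):
    """Posterior mean and latent-function variance at the test inputs Xs."""
    X, mu, L, alpha = model
    Ks = rbf(Xs, X)
    v = np.linalg.solve(L, Ks.T)
    return mu + Ks @ alpha, np.maximum(1.0 - (v**2).sum(0), 0.0)

def ccr_fit(X, y, k=2):
    """One pass: cluster jointly, classify from inputs, regress per cluster."""
    labels = kmeans(np.hstack([X, y[:, None]]), k)
    W = fit_gate(X, labels, k)
    experts = [fit_gp(X[labels == j], y[labels == j]) for j in range(k)]
    return W, experts

def ccr_predict(model, Xs):
    """Route each test input to its gated expert and predict with that expert."""
    W, experts = model
    z = gate(W, Xs)
    mean, var = np.empty(len(Xs)), np.empty(len(Xs))
    for j, expert in enumerate(experts):
        mask = z == j
        if mask.any():
            mean[mask], var[mask] = predict_gp(expert, Xs[mask])
    return mean, var

# Toy data: two regimes with a jump at x = 0.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.where(X[:, 0] < 0, np.sin(3 * X[:, 0]), 5 + 0.5 * X[:, 0])
y = y + 0.1 * rng.normal(size=200)

model = ccr_fit(X, y)
mean, var = ccr_predict(model, np.array([[-2.0], [2.0]]))
```

Here each test point is predicted by a single expert chosen by the gate, which is a hard allocation. The iterative refinement mentioned in the abstract would alternate between updating the experts given the allocations and the allocations given the experts; this sketch stops after the first pass.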