Neural Operator Variational Inference based on Regularized Stein Discrepancy for Deep Gaussian Processes

Deep Gaussian Process (DGP) models offer a powerful nonparametric approach to Bayesian inference, but exact inference is typically intractable, which motivates the use of various approximations. Existing approaches have drawbacks, however: mean-field Gaussian assumptions limit the expressiveness and efficacy of DGP models, while stochastic approximation can be computationally expensive. To tackle these challenges, we introduce Neural Operator Variational Inference (NOVI) for Deep Gaussian Processes. NOVI uses a neural generator to obtain a sampler and minimizes the Regularized Stein Discrepancy in L2 space between the generated distribution and the true posterior. We solve the resulting minimax problem using Monte Carlo estimation and subsampling stochastic optimization techniques. We demonstrate that the bias introduced by our method can be controlled by multiplying the Fisher divergence by a constant, which leads to robust error control and ensures the stability and precision of the algorithm. Our experiments on datasets ranging in size from hundreds to tens of thousands of samples demonstrate the effectiveness and faster convergence rate of the proposed method. We achieve a classification accuracy of 93.56% on the CIFAR10 dataset, outperforming SOTA Gaussian process methods. Furthermore, our method guarantees theoretically controlled prediction error for DGP models and demonstrates remarkable performance on various datasets. We are optimistic that NOVI has the potential to enhance the performance of deep Bayesian nonparametric models and could have significant implications for various practical applications.
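The link between the regularized Stein objective and the Fisher divergence can be illustrated with a small numerical sketch. It assumes the commonly used L2-regularized form, sup over f of E_q[score_p(z) f(z) + f'(z)] - (lam / 2) E_q[f(z)^2], which may differ in detail from the formulation used in the paper. It also replaces the neural generator and neural critic with a one-dimensional Gaussian toy problem in which the optimal critic is known in closed form. The names `rsd_estimate`, `score_p`, `score_q`, and `lam` are hypothetical and chosen for illustration only.

```python
import numpy as np


def rsd_estimate(samples, score_p, critic, critic_grad, lam):
    """Monte Carlo estimate of the 1D regularized Stein objective
    E_q[score_p(z) * f(z) + f'(z)] - (lam / 2) * E_q[f(z)^2]
    for a fixed critic f, using samples z drawn from q."""
    f = critic(samples)
    stein_term = score_p(samples) * f + critic_grad(samples)
    return float(np.mean(stein_term - 0.5 * lam * f ** 2))


rng = np.random.default_rng(0)
m, lam = 1.5, 2.0  # assumed toy values: mean shift of q and regularization weight

# Samples from q = N(m, 1); the target p = N(0, 1) plays the role of the posterior.
z = rng.normal(loc=m, scale=1.0, size=200_000)


def score_p(x):
    return -x  # d/dx log p(x) for p = N(0, 1)


def score_q(x):
    return -(x - m)  # d/dx log q(x) for q = N(m, 1)


def optimal_critic(x):
    # Under the assumed objective the maximizer is (score_p - score_q) / lam,
    # which is the constant -m / lam in this toy problem.
    return (score_p(x) - score_q(x)) / lam


def zero_grad(x):
    return np.zeros_like(x)  # derivative of a constant critic


def half_critic(x):
    return 0.5 * optimal_critic(x)  # a deliberately suboptimal critic


rsd_opt = rsd_estimate(z, score_p, optimal_critic, zero_grad, lam)
rsd_half = rsd_estimate(z, score_p, half_critic, zero_grad, lam)

# Fisher divergence E_q[(score_p - score_q)^2], estimated on the same samples.
fisher = float(np.mean((score_p(z) - score_q(z)) ** 2))

print(rsd_opt, fisher / (2.0 * lam), rsd_half)
```

In this toy setting the value at the optimal critic matches the Fisher divergence scaled by 1 / (2 * lam) up to Monte Carlo error, and the suboptimal critic gives a smaller value. In a full method the critic would be a trained network and the samples would come from the generator, so this sketch only illustrates the inner maximization.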
