Many real-world tasks require optimizing expensive black-box functions accessible only through noisy evaluations, a setting commonly addressed with Bayesian optimization (BO). While Bayesian neural networks (BNNs) have recently emerged as scalable alternatives to Gaussian processes (GPs), traditional BNN-BO frameworks remain burdened by expensive posterior sampling and acquisition-function optimization. In this work, we propose {VBO-MI} (Variational Bayesian Optimization with Mutual Information), a fully gradient-based BO framework that leverages recent advances in variational mutual-information estimation. To enable end-to-end gradient flow, we employ an actor-critic architecture consisting of an {action-net} that navigates the input space and a {variational critic} that estimates information gain. This formulation eliminates the traditional inner-loop acquisition-optimization bottleneck, achieving up to a {$10^2 \times$ reduction in FLOPs} over BNN-BO baselines. We evaluate our method on a diverse suite of benchmarks, including high-dimensional synthetic functions and complex real-world tasks such as PDE optimization, the Lunar Lander control problem, and the categorical Pest Control problem. Our experiments demonstrate that VBO-MI consistently matches or exceeds the optimization performance of the baselines while offering superior computational scalability.