Black-Box Optimization (BBO) has achieved success across various scientific and engineering domains, particularly when function evaluations are costly and gradients are unavailable. Existing Bayesian Optimization (BO) methods typically balance exploration and exploitation to optimize such costly objective functions. However, these methods often suffer from a significant one-step bias, which can lead to convergence to local optima and poor performance on complex or high-dimensional tasks. Motivated by these limitations, we propose the Reinforced Energy-Based Model for Bayesian Optimization (REBMBO), which integrates a Gaussian Process (GP) for local guidance with an Energy-Based Model (EBM) that captures global structural information. Notably, we formulate each BO iteration as a Markov Decision Process (MDP) and use Proximal Policy Optimization (PPO) to perform adaptive multi-step lookahead, dynamically adjusting the depth and direction of exploration to overcome the limitations of traditional BO methods. Extensive experiments on synthetic and real-world benchmarks confirm the superior performance of REBMBO, and additional analyses across various GP configurations further highlight its adaptability and robustness.
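To make the high-level pipeline concrete, the following is a minimal, hypothetical sketch of one REBMBO-style iteration, not the authors' implementation: a GP surrogate supplies local posterior statistics, an EBM assigns a global energy score to candidate points, and a PPO-style policy selects the next query. All class and function names (`GPSurrogate`, `EnergyModel`, `ppo_policy_propose`, `rebmbo_step`) are illustrative placeholders introduced here, and the toy models stand in for the learned components described in the paper.

```python
# Hypothetical sketch of one REBMBO-style iteration (illustrative only).
import numpy as np


class GPSurrogate:
    """Placeholder GP surrogate: returns a crude posterior mean and uncertainty."""

    def __init__(self):
        self.X, self.y = [], []

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)

    def posterior(self, x):
        # Toy stand-in: nearest-neighbor mean, distance-based uncertainty.
        if not self.X:
            return 0.0, 1.0
        d = [np.linalg.norm(np.asarray(x) - np.asarray(xi)) for xi in self.X]
        i = int(np.argmin(d))
        return self.y[i], min(1.0, d[i])


class EnergyModel:
    """Placeholder EBM: lower energy marks globally more promising regions."""

    def energy(self, x):
        return float(np.sum(np.asarray(x) ** 2))  # illustrative surrogate energy


def ppo_policy_propose(state, candidates, rng):
    """Stand-in for a PPO actor: here it samples a candidate uniformly.
    In REBMBO this choice would come from a trained policy network."""
    return candidates[rng.integers(len(candidates))]


def rebmbo_step(objective, gp, ebm, rng, dim=2, n_candidates=64):
    # The state combines local GP statistics with global EBM energies, following
    # the paper's high-level description; this exact encoding is an assumption.
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, dim))
    state = [(gp.posterior(c), ebm.energy(c)) for c in candidates]
    x_next = ppo_policy_propose(state, candidates, rng)
    y_next = objective(x_next)  # costly black-box evaluation
    gp.fit(gp.X + [x_next], gp.y + [y_next])
    # A full implementation would also update the EBM and the PPO policy
    # with the observed reward before the next iteration.
    return x_next, y_next


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gp, ebm = GPSurrogate(), EnergyModel()
    for _ in range(5):
        rebmbo_step(lambda x: -float(np.sum(x ** 2)), gp, ebm, rng)
```

The multi-step lookahead described in the abstract would enter through the PPO policy's training signal, which rewards query sequences rather than single acquisitions; the sketch above only shows where that policy plugs into the BO loop.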