Sequential optimization methods often confront the curse of dimensionality in high-dimensional spaces. Current approaches under the Gaussian process framework are still burdened by the computational complexity of tracking Gaussian process posteriors, and they must either partition the optimization problem into small regions to ensure exploration or assume an underlying low-dimensional structure. Building on the idea of moving candidate points towards more promising positions, we propose a new method based on Markov Chain Monte Carlo to efficiently sample from an approximate posterior. We provide theoretical guarantees of its convergence in the Gaussian process Thompson sampling setting. We also show experimentally that both the Metropolis-Hastings and the Langevin dynamics versions of our algorithm outperform state-of-the-art methods on high-dimensional sequential optimization and reinforcement learning benchmarks.
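The idea of moving candidate points towards more promising positions can be illustrated with a minimal random-walk Metropolis-Hastings sketch. This is not the algorithm proposed here: it assumes a fixed, hypothetical `surrogate` function standing in for a sampled objective, and the names `mh_transit`, `temperature`, and `step_size`, along with their values, are illustrative assumptions only.

```python
import numpy as np

def surrogate(x):
    """Hypothetical stand-in for a sampled objective; maximized at the origin."""
    return -np.sum(x ** 2, axis=-1)

def mh_transit(points, log_target, n_steps=200, step_size=0.1, seed=0):
    """Move each candidate point independently with random-walk Metropolis-Hastings."""
    rng = np.random.default_rng(seed)
    x = np.array(points, dtype=float)
    log_p = log_target(x)
    for _ in range(n_steps):
        # Gaussian random-walk proposal around the current positions.
        proposal = x + step_size * rng.standard_normal(x.shape)
        log_p_new = log_target(proposal)
        # The proposal is symmetric, so accept with probability min(1, p_new / p_old).
        accept = np.log(rng.uniform(size=len(x))) < log_p_new - log_p
        x[accept] = proposal[accept]
        log_p[accept] = log_p_new[accept]
    return x

temperature = 0.1  # assumed value; smaller values concentrate mass near the maximum
log_target = lambda x: surrogate(x) / temperature

start = np.full((16, 20), 2.0)  # 16 candidate points in 20 dimensions
moved = mh_transit(start, log_target)
print(surrogate(start).mean(), surrogate(moved).mean())
```

Because the target density is proportional to `exp(surrogate(x) / temperature)`, the chain spends most of its time where the surrogate is large, so the candidates drift away from their poor starting positions. A Langevin dynamics variant would replace the random-walk proposal with a gradient-informed step.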