Bayesian Optimization (BO) for minimizing expensive functions of continuous variables uses all the knowledge acquired from previous samples (the points ${\boldsymbol x}_i$ and the values $f({\boldsymbol x}_i)$) to build a surrogate model based on Gaussian processes. The surrogate is then used to choose the next point to sample, carefully balancing exploration and exploitation. Initially intended for low-dimensional spaces, BO has recently been adapted to very high-dimensional spaces (up to about one thousand dimensions). In this paper we consider a much simpler algorithm, called "Reactive Affine Shaker" (RAS). The next sample is always generated with a uniform probability distribution inside a parallelepiped (the "box"). At each iteration, the shape of the box is adapted through an affine transformation, based only on the position of the point $\boldsymbol x$ and on the success or failure in improving the function. The function values are therefore not used directly to modify the search area or to generate the next sample. The full dimensionality is kept (no active subspaces). Despite its extreme simplicity and its reliance on stochastic local search alone, RAS produces results that are, surprisingly, comparable to the state-of-the-art results of high-dimensional versions of BO, although it requires somewhat more function evaluations. An ablation study and an analysis of the probability distribution of directions (improving steps and prevailing box orientation) in very high-dimensional spaces are conducted to better understand the behavior of RAS and to assess the relative importance of its algorithmic building blocks for the final results.
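To make the box-adaptation idea concrete, the following is a minimal sketch of a search loop in the spirit of RAS, not the implementation evaluated in the paper. The abstract fixes only the uniform sampling inside a parallelepiped and the affine update driven by success or failure. Everything else here is an assumption made for illustration: the function name `ras_minimize`, the parameters `box_size`, `expand` and `shrink`, the "try the opposite step on failure" rule, and the rank-one form of the affine transformation $B \leftarrow \left(I + (\rho - 1)\,\boldsymbol\delta\boldsymbol\delta^T / \|\boldsymbol\delta\|^2\right) B$, which stretches or compresses the box along the last displacement $\boldsymbol\delta$.

```python
import numpy as np


def ras_minimize(f, x0, n_evals=2000, box_size=1.0, expand=1.2, shrink=0.8, seed=0):
    """Illustrative RAS-style local search (hypothetical names and defaults)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    B = box_size * np.eye(d)  # columns are the edge vectors of the box
    fx = f(x)
    evals = 1
    while evals < n_evals:
        # Uniform sample inside the parallelepiped spanned by B.
        delta = B @ rng.uniform(-1.0, 1.0, size=d)
        norm2 = float(delta @ delta)
        if norm2 == 0.0:
            continue
        improved = False
        # Assumption: try the step, then its opposite if the first one fails.
        for step in (delta, -delta):
            if evals >= n_evals:
                break
            y = x + step
            fy = f(y)
            evals += 1
            # Function values are only compared, never used numerically.
            if fy < fx:
                x, fx = y, fy
                improved = True
                break
        # Affine update: stretch the box along delta on success, compress on failure.
        rho = expand if improved else shrink
        B += (rho - 1.0) * np.outer(delta, delta @ B) / norm2
    return x, fx
```

Calling `ras_minimize(lambda v: float(np.sum(v ** 2)), np.full(10, 3.0))` runs the sketch on a 10-dimensional sphere function. Note that the loop keeps the full dimensionality and uses $f$ only through the comparison `fy < fx`, matching the description above.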