Learning decompositions of expensive-to-evaluate black-box functions promises to scale Bayesian optimisation (BO) to high-dimensional problems. However, the success of these techniques depends on finding decompositions that accurately represent the black-box. Whereas previous work learns these decompositions from data, this paper investigates data-independent decomposition sampling rules. We find that data-driven learners of decompositions can easily be misled towards local decompositions that do not hold globally across the search space. We then formally show that a random tree-based decomposition sampler has favourable theoretical guarantees that effectively trade off maximal information gain against the functional mismatch between the actual black-box and the surrogate provided by the decomposition. These results motivate the random decomposition upper-confidence bound algorithm (RDUCB), which is straightforward to implement, (almost) plug-and-play, and, surprisingly, yields significant empirical gains over the previous state-of-the-art on a comprehensive set of benchmarks. We also confirm the plug-and-play nature of our modelling component by integrating our method with HEBO, which shows practical gains on the highest-dimensional tasks from Bayesmark.
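To make the idea of a data-independent, tree-based decomposition concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each tree edge (i, j) stands for one two-dimensional component of an additive surrogate, and it draws the tree by attaching each dimension to a random earlier one, which is one simple choice among many and is not uniform over all trees. The names `sample_random_tree` and `additive_value` are hypothetical.

```python
import random


def sample_random_tree(num_dims, rng=None):
    """Sample a random tree over dimensions 0..num_dims-1 and return its edges.

    Assumption: random-attachment sampling over a shuffled ordering; the
    sampler analysed in the paper may differ.
    """
    if rng is None:
        rng = random.Random()
    order = list(range(num_dims))
    rng.shuffle(order)
    edges = []
    for k in range(1, num_dims):
        # Attach the k-th node to a uniformly chosen node already in the tree.
        parent = order[rng.randrange(k)]
        edges.append(tuple(sorted((parent, order[k]))))
    return edges


def additive_value(x, edges, component):
    """Evaluate a decomposed surrogate: the sum of component(x_i, x_j) over edges."""
    return sum(component(x[i], x[j]) for i, j in edges)


# The decomposition is drawn without looking at any observations.
edges = sample_random_tree(6, random.Random(0))
print(edges)
```

In a BO loop, such edges could define the pairwise terms of an additive Gaussian-process kernel on which an upper-confidence bound is maximised; that wiring is left out here because it depends on the surrogate library in use.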