We consider the problem of optimizing expensive black-box functions over high-dimensional combinatorial spaces, a problem that arises in many science, engineering, and machine learning applications. We use Bayesian optimization (BO) and propose a novel surrogate modeling approach that efficiently handles a large number of binary and categorical parameters. The key idea is to select a set of discrete structures from the input space (the dictionary) and use them to define an ordinal embedding for high-dimensional combinatorial structures. This allows us to use existing Gaussian process models for continuous spaces. We develop a principled approach based on binary wavelets to construct dictionaries for binary spaces, and propose a randomized construction method that generalizes to categorical spaces. We provide theoretical justification for the effectiveness of the dictionary-based embeddings. Our experiments on diverse real-world benchmarks demonstrate that the proposed surrogate modeling approach is more effective than state-of-the-art BO methods.
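To make the dictionary idea concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not say how the ordinal embedding is computed, so the sketch assumes that each coordinate of the embedding is the Hamming distance between the input and one dictionary element. The dictionary here is drawn uniformly at random purely for illustration; it is neither the binary-wavelet construction nor the randomized construction mentioned above. The function names `random_binary_dictionary` and `hamming_embedding` are hypothetical.

```python
import numpy as np


def random_binary_dictionary(m, d, rng):
    """Draw m binary structures of length d uniformly at random.

    Illustrative stand-in only: not the wavelet-based or randomized
    dictionary construction referred to in the abstract.
    """
    return rng.integers(0, 2, size=(m, d))


def hamming_embedding(X, A):
    """Embed each row of X as its Hamming distances to the rows of A.

    X: (n, d) array of binary inputs.
    A: (m, d) array of dictionary elements.
    Returns an (n, m) integer array. The use of Hamming distance is an
    assumption; the abstract only says the embedding is ordinal.
    """
    X = np.asarray(X)
    A = np.asarray(A)
    # Broadcast to (n, m, d), mark mismatching positions, count per pair.
    return (X[:, None, :] != A[None, :, :]).sum(axis=2)


rng = np.random.default_rng(0)
A = random_binary_dictionary(m=4, d=10, rng=rng)   # dictionary of 4 structures
X = rng.integers(0, 2, size=(3, 10))               # 3 candidate inputs
Z = hamming_embedding(X, A)                        # shape (3, 4), integer-valued
```

Under this assumption, each 10-dimensional binary input becomes a 4-dimensional vector of integers in the range 0 to 10, which a Gaussian process with a standard kernel for continuous inputs could then take as its input.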