Benchmark design in black-box optimization (BBO) is a fundamental yet open-ended topic. Early BBO benchmarks are predominantly human-crafted, which introduces expert bias and constrains diversity. Automating the design process can relieve the human-in-the-loop burden while enhancing diversity and objectivity. We propose Evolution of Benchmark (EoB), an automated BBO benchmark designer powered by large language models (LLMs) and their program-evolution capability. Specifically, we formulate benchmark design as a bi-objective optimization problem that maximizes (i) landscape diversity and (ii) algorithm-differentiation ability across a portfolio of BBO solvers. Under this paradigm, EoB iteratively prompts the LLM to evolve a population of benchmark programs and employs a reflection-based scheme to co-evolve each landscape and its corresponding program. Comprehensive experiments validate that EoB is a competitive candidate across multiple use cases: 1) benchmarking BBO algorithms; 2) training and testing learning-assisted BBO algorithms; and 3) serving as a proxy for expensive real-world problems.
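For concreteness, the bi-objective formulation can be sketched as follows; the notation here (benchmark-program space $\mathcal{B}$, diversity score $D$, differentiation score $A$, solver portfolio $\mathcal{S}$) is our illustrative shorthand rather than symbols defined by the paper:
\[
  \max_{b \in \mathcal{B}} \; \bigl( D(b),\; A(b;\, \mathcal{S}) \bigr),
\]
where $b$ is a candidate benchmark program, $D(b)$ measures the diversity of the fitness landscape $b$ induces, and $A(b;\, \mathcal{S})$ measures how well running the solvers in $\mathcal{S}$ on $b$ separates their performances. Since the two objectives are optimized jointly, the evolved population would plausibly approximate a Pareto front over $(D, A)$, though the exact selection scheme is not specified in this abstract.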