Automatic Design of Optimization Test Problems with Large Language Models

The development of black-box optimization algorithms depends on the availability of benchmark suites that are both diverse and representative of real-world problem landscapes. Widely used collections such as BBOB and CEC remain dominated by hand-crafted synthetic functions and cover only a limited part of the high-dimensional space of Exploratory Landscape Analysis (ELA) features, which in turn biases evaluation and hinders the training of meta-black-box optimizers. We introduce Evolution of Test Functions (EoTF), a framework that automatically generates continuous optimization test functions whose landscapes match a specified target ELA feature vector. EoTF adapts LLM-driven evolutionary search, originally proposed for heuristic discovery, to evolve interpretable, self-contained NumPy implementations of objective functions by minimizing the distance between the sampled ELA features of generated candidates and a target profile. In experiments on 24 noiseless BBOB functions and a contamination-mitigating suite of 24 MA-BBOB hybrid functions, EoTF reliably produces non-trivial functions with closely matching ELA characteristics and preserves optimizer performance rankings under fixed evaluation budgets, supporting their validity as surrogate benchmarks. While a baseline neural-network-based generator achieves higher accuracy in 2D, EoTF substantially outperforms it in 3D and exhibits stable solution quality as dimensionality increases, highlighting favorable scalability. Overall, EoTF offers a practical route to scalable, portable, and interpretable benchmark generation targeted to desired landscape properties.
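
The fitness signal described above, the distance between the sampled features of a candidate function and a target profile, can be illustrated with a minimal sketch. This is not the EoTF implementation: the three statistics below (skewness, excess kurtosis, and the R^2 of a linear fit) are simple stand-ins for a real ELA feature set, and the names `sample_features`, `feature_distance`, `sphere`, and `candidate`, the sampling domain [-5, 5], the sample size, and the Euclidean distance are all assumptions made for illustration.

```python
import numpy as np


def sample_features(f, dim, n_samples=500, lower=-5.0, upper=5.0, seed=0):
    """Toy stand-in for an ELA feature vector (NOT the real ELA feature set)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lower, upper, size=(n_samples, dim))
    y = np.array([f(x) for x in X])

    # Standardize objective values so the distribution features are scale-free.
    z = (y - y.mean()) / (y.std() + 1e-12)
    skewness = np.mean(z ** 3)
    excess_kurtosis = np.mean(z ** 4) - 3.0

    # R^2 of a linear least-squares model with intercept: how "linear" f looks.
    A = np.column_stack([np.ones(n_samples), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    r2_linear = 1.0 - residual.var() / (y.var() + 1e-12)

    return np.array([skewness, excess_kurtosis, r2_linear])


def feature_distance(f, target, dim):
    """Euclidean distance between sampled features of f and a target vector."""
    return float(np.linalg.norm(sample_features(f, dim) - target))


def sphere(x):
    """Reference function whose sampled features serve as the target profile."""
    return float(np.sum(x ** 2))


def candidate(x):
    """Hypothetical candidate, standing in for an LLM-generated function."""
    return float(np.sum(np.abs(x)))


target = sample_features(sphere, dim=2)
print("distance of sphere to its own profile:", feature_distance(sphere, target, 2))
print("distance of candidate to the profile: ", feature_distance(candidate, target, 2))
```

In an evolutionary loop, a score of this kind would be computed for every generated candidate, and candidates with smaller distances would be kept for the next generation. Fixing the sampling seed, as done here, makes scores comparable across candidates.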
