The number of proposed iterative optimization heuristics is growing steadily, and this growth has prompted considerable discussion within the wider community. One recurring criticism of many new algorithms is that they are presented in terms of a metaphor rather than in terms of their potential algorithmic contributions. Several studies of popular metaphor-based algorithms have highlighted these problems, even identifying algorithms that are functionally equivalent to older, existing methods. Unfortunately, such detailed analysis does not scale to the whole set of metaphor-based algorithms. We therefore investigate ways in which benchmarking can shed light on these algorithms. To this end, we run a set of 294 algorithm implementations on the BBOB function suite and investigate how the choice of budget, performance measure, and other aspects of experimental design affects the comparison of these algorithms. Our results emphasize why benchmarking is a key step in expanding our understanding of the algorithm space, and which challenges still need to be overcome to fully gauge the potential improvements to the state of the art hiding behind the metaphors.