Bayesian optimization is a powerful method for automating the tuning of compilers. The complex landscape of autotuning presents a myriad of rarely considered structural challenges for black-box optimizers, and the lack of standardized benchmarks has limited the study of Bayesian optimization within the domain. To address this, we present CATBench, a comprehensive benchmarking suite that captures the complexities of compiler autotuning, ranging from discrete, conditional, and permutation parameter types to known and unknown binary constraints, as well as both multi-fidelity and multi-objective evaluations. The benchmarks in CATBench span a range of machine learning-oriented computations, from tensor algebra to image processing and clustering, and use state-of-the-art compilers, such as TACO and RISE/ELEVATE. CATBench offers a unified interface for evaluating Bayesian optimization algorithms, promoting reproducibility and innovation through an easy-to-use, fully containerized setup of both surrogate and real-world compiler optimization tasks. We validate CATBench on several state-of-the-art algorithms, revealing their strengths and weaknesses and demonstrating the suite's potential for advancing both Bayesian optimization and compiler autotuning research.
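To make the structural challenges concrete, the following is a minimal, hypothetical sketch (not CATBench's actual API; all names, parameters, and the constraint are illustrative assumptions) of a toy autotuning search space combining a discrete parameter, a permutation parameter, and a known binary constraint, with a synthetic objective standing in for a measured runtime:

```python
import itertools

# Illustrative toy search space (hypothetical, not CATBench's API):
# a discrete tile-size parameter and a permutation loop-order parameter.
TILE_SIZES = [8, 16, 32, 64]
LOOP_ORDERS = list(itertools.permutations("ijk"))

def is_feasible(tile, order):
    # Hypothetical *known* binary constraint: large tiles require
    # the 'i' loop to be outermost. Unknown constraints, by contrast,
    # would only surface as failures at evaluation time.
    return tile <= 32 or order[0] == "i"

def cost(tile, order):
    # Synthetic objective standing in for an actual compile-and-run
    # measurement; minimized at tile=32 with loop order ('i','j','k').
    penalty = sum(a != b for a, b in zip(order, "ijk"))
    return abs(tile - 32) / 32 + 0.5 * penalty

def best_config():
    # Exhaustive enumeration as a trivial baseline optimizer; a
    # Bayesian optimizer would instead propose configurations from a
    # surrogate model, respecting the known constraint.
    best = None
    for tile in TILE_SIZES:
        for order in LOOP_ORDERS:
            if not is_feasible(tile, order):
                continue
            c = cost(tile, order)
            if best is None or c < best[0]:
                best = (c, tile, order)
    return best
```

Even this toy space mixes parameter types that many off-the-shelf optimizers handle poorly, which is the gap the benchmark suite is meant to expose.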