Hyperparameter optimization (HPO) is crucial for strong performance of deep learning algorithms, and real-world applications often impose constraints, such as memory usage or latency, on top of the performance requirement. In this work, we propose constrained TPE (c-TPE), an extension of the widely used, versatile Bayesian optimization method, the tree-structured Parzen estimator (TPE), to handle these constraints. Our proposed extension goes beyond a simple combination of an existing acquisition function and the original TPE: it includes modifications that address issues that cause poor performance. We thoroughly analyze these modifications both empirically and theoretically, providing insights into how they overcome these issues. In the experiments, we demonstrate that c-TPE achieves the best average rank among existing methods, with statistical significance, on 81 expensive HPO problems with inequality constraints. Due to the lack of baselines, we discuss the applicability of our method to hard-constrained optimization only in Appendix D.