Hyperparameter optimization (HPO) is one of the most critical problems in machine learning, since the choice of hyperparameters has a significant impact on final model performance. Although many HPO algorithms exist, they either lack theoretical guarantees or require strong assumptions. To address this gap, we introduce BLiE, a Lipschitz-bandit-based algorithm for HPO that assumes only Lipschitz continuity of the objective function. BLiE exploits the landscape of the objective function to search the hyperparameter space adaptively. Theoretically, we show that (i) BLiE finds an $\epsilon$-optimal hyperparameter with a total budget of $O\left( \left( \frac{1}{\epsilon} \right)^{d_z + \beta} \right)$, where $d_z$ and $\beta$ are problem-intrinsic quantities; and (ii) BLiE is highly parallelizable. Empirically, we demonstrate that BLiE outperforms state-of-the-art HPO algorithms on benchmark tasks. We also apply BLiE to search for the noise schedule of diffusion models. Compared with the default schedule, the schedule found by BLiE greatly improves sampling speed.
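To make the idea of adaptively searching under a Lipschitz assumption concrete, the following is a minimal sketch of a generic cube-elimination search over $[0, 1]^d$. It is not the BLiE procedure, whose details the abstract does not give. The function names, the elimination margin (twice the Lipschitz constant times the cube diameter, with no explicit confidence term), the sampling scheme, and the toy objective are all assumptions chosen for illustration.

```python
import itertools
import math
import random


def zooming_elimination_search(objective, dim, lipschitz, rounds=6,
                               samples_per_cube=4, seed=0):
    """Maximize `objective` over [0, 1]^dim by repeatedly discarding weak cubes.

    Generic Lipschitz-style elimination sketch; NOT the BLiE algorithm itself.
    """
    rng = random.Random(seed)
    cubes = [tuple(0.0 for _ in range(dim))]  # each cube is stored as its lower corner
    edge = 1.0                                # all active cubes share this edge length
    best_center = tuple(0.5 for _ in range(dim))

    for _ in range(rounds):
        diameter = edge * math.sqrt(dim)

        # Score every active cube by the mean objective at random points inside it.
        # These evaluations are independent, so a real system could run them in parallel.
        scores = {}
        for corner in cubes:
            values = [
                objective(tuple(c + rng.random() * edge for c in corner))
                for _ in range(samples_per_cube)
            ]
            scores[corner] = sum(values) / len(values)

        # Without noise, the cube holding the maximizer scores at least
        # top - lipschitz * diameter, so it survives this test. The factor 2 is
        # assumed slack standing in for a proper confidence term under noise.
        top = max(scores.values())
        survivors = [c for c in cubes if scores[c] >= top - 2.0 * lipschitz * diameter]

        best_corner = max(survivors, key=scores.get)
        best_center = tuple(c + edge / 2 for c in best_corner)

        # Split each surviving cube into 2^dim children with half the edge length.
        half = edge / 2
        cubes = [
            tuple(c + o for c, o in zip(corner, offset))
            for corner in survivors
            for offset in itertools.product((0.0, half), repeat=dim)
        ]
        edge = half

    return best_center


def toy_objective(point):
    """Hypothetical validation score, maximized at (0.3, 0.7)."""
    return -abs(point[0] - 0.3) - abs(point[1] - 0.7)


if __name__ == "__main__":
    print(zooming_elimination_search(toy_objective, dim=2, lipschitz=1.5))
```

In this sketch the evaluation budget concentrates on regions that remain plausibly near-optimal, which is the intuition behind a budget that depends on problem-intrinsic quantities rather than only on the ambient dimension. Each round's evaluations are mutually independent, which loosely illustrates why elimination-style schemes lend themselves to parallel execution.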