The recent rise in popularity of Hyperparameter Optimization (HPO) for deep learning has highlighted the role that good hyperparameter (HP) space design can play in training strong models. Designing a good HP space, in turn, depends critically on understanding the role of different HPs. This motivates research on HP Importance (HPI), for example through the popular functional ANOVA (f-ANOVA) method. However, the original f-ANOVA formulation is inapplicable to the subspaces most relevant to algorithm designers, such as those defined by top performance. To overcome this issue, we derive a novel formulation of f-ANOVA for arbitrary subspaces and propose an algorithm that uses Pearson divergence (PED) to enable a closed-form calculation of HPI. We demonstrate that this new algorithm, dubbed PED-ANOVA, successfully identifies important HPs in different subspaces while being extremely computationally efficient.
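To make the idea of a divergence-based importance score concrete, the following is a minimal sketch and not the PED-ANOVA algorithm itself: the abstract does not state the exact estimator or scaling, so this example assumes simple histogram density estimates and scores each HP by the discrete Pearson divergence between its distribution among the top-performing configurations and its distribution among all configurations. The function names, the HP names `lr` and `dropout`, the synthetic loss, and the parameters `gamma` and `n_bins` are all hypothetical and chosen only for illustration.

```python
import random


def pearson_divergence(p, q):
    """Discrete Pearson divergence: sum_i q_i * (p_i / q_i - 1)^2.

    Bins with q_i == 0 are skipped; in this sketch p is built from a
    subset of the samples behind q, so p_i is also 0 in those bins.
    """
    return sum(qi * (pi / qi - 1.0) ** 2 for pi, qi in zip(p, q) if qi > 0)


def histogram(values, n_bins, low, high):
    """Normalized histogram over n_bins equal-width bins on [low, high]."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for v in values:
        idx = min(int((v - low) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def top_subspace_importance(configs, losses, gamma=0.1, n_bins=10):
    """Assumed scoring rule: per-HP divergence between the top-gamma
    subset and the full set of configurations, normalized to sum to 1."""
    n_top = max(1, int(gamma * len(configs)))
    order = sorted(range(len(configs)), key=lambda i: losses[i])
    top = [configs[i] for i in order[:n_top]]
    raw = {}
    for name in configs[0]:
        q = histogram([c[name] for c in configs], n_bins, 0.0, 1.0)
        p = histogram([c[name] for c in top], n_bins, 0.0, 1.0)
        raw[name] = pearson_divergence(p, q)
    total = sum(raw.values())
    return {k: (v / total if total > 0 else 0.0) for k, v in raw.items()}


# Synthetic data (hypothetical): the loss depends strongly on "lr"
# and only weakly on "dropout"; both HPs are sampled uniformly in [0, 1).
rng = random.Random(0)
configs = [{"lr": rng.random(), "dropout": rng.random()} for _ in range(2000)]
losses = [(c["lr"] - 0.3) ** 2 + 0.0001 * c["dropout"] for c in configs]

scores = top_subspace_importance(configs, losses, gamma=0.1)
print(scores)
```

In this toy setting the top 10% of configurations cluster around `lr` near 0.3 while remaining close to uniform in `dropout`, so the `lr` score dominates. Each score needs only one pass over the samples and one sum over bins, which gives some intuition for why a closed-form, divergence-based calculation can be cheap, although the cost profile of the actual method may differ.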