The recent rise in popularity of Hyperparameter Optimization (HPO) for deep learning has highlighted the role that good hyperparameter (HP) space design can play in training strong models. In turn, designing a good HP space is critically dependent on understanding the role of different HPs. This motivates research on HP Importance (HPI), e.g., with the popular method of functional ANOVA (f-ANOVA). However, the original f-ANOVA formulation is inapplicable to the subspaces most relevant to algorithm designers, such as those defined by top performance. To overcome this issue, we derive a novel formulation of f-ANOVA for arbitrary subspaces and propose an algorithm that uses Pearson divergence (PED) to enable a closed-form calculation of HPI. We demonstrate that this new algorithm, dubbed PED-ANOVA, is able to successfully identify important HPs in different subspaces while also being extremely computationally efficient.
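To make the core idea concrete, the following is a minimal sketch, not the PED-ANOVA algorithm itself: the abstract gives neither the exact formulation nor the estimator. It assumes purely categorical HPs with integer-coded choices and scores each HP by the Pearson divergence, D_PE(p || q) = Σ_x q(x) (p(x)/q(x) − 1)², between its empirical distribution among the top-γ configurations (p) and among all evaluated configurations (q). The function names, the `gamma` parameter, and the normalization to sum to one are illustrative assumptions.

```python
import numpy as np


def pearson_divergence(p, q):
    """Pearson divergence sum_x q(x) * (p(x)/q(x) - 1)^2 for discrete p, q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = q > 0  # skip choices that never occur in the reference distribution
    return float(np.sum(q[mask] * (p[mask] / q[mask] - 1.0) ** 2))


def top_subspace_importance(configs, losses, n_choices, gamma=0.1):
    """Hypothetical helper: score each categorical HP by how far its
    distribution in the top-gamma subspace departs from its distribution
    over all evaluated configurations.

    configs:   (n, d) integer array, one column per HP
    losses:    (n,) array, lower is better
    n_choices: number of choices for each of the d HPs
    """
    configs = np.asarray(configs)
    losses = np.asarray(losses, dtype=float)
    n_top = max(1, int(np.ceil(gamma * len(losses))))
    top = configs[np.argsort(losses)[:n_top]]  # best-performing configurations

    scores = []
    for d, k in enumerate(n_choices):
        q = np.bincount(configs[:, d], minlength=k) / len(configs)
        p = np.bincount(top[:, d], minlength=k) / len(top)
        scores.append(pearson_divergence(p, q))

    scores = np.array(scores)
    total = scores.sum()
    # Normalize so the scores sum to one (an illustrative convention).
    return scores / total if total > 0 else scores


# Synthetic check: HP 0 fully determines the loss, HP 1 is irrelevant.
rng = np.random.default_rng(0)
configs = rng.integers(0, 4, size=(2000, 2))
losses = configs[:, 0] + 0.01 * rng.normal(size=2000)
importance = top_subspace_importance(configs, losses, n_choices=[4, 4], gamma=0.1)
print(importance)
```

In this toy setting the first HP should receive almost all of the importance, because only its distribution changes when restricting to the top 10% of configurations. Since each score is a closed-form sum over histogram bins, no surrogate model needs to be fitted, which is the kind of computational saving the abstract alludes to.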