The recent rise in popularity of Hyperparameter Optimization (HPO) for deep learning has highlighted the role that good hyperparameter (HP) space design can play in training strong models. In turn, designing a good HP space is critically dependent on understanding the role of different HPs. This motivates research on HP Importance (HPI), e.g., with the popular method of functional ANOVA (f-ANOVA). However, the original f-ANOVA formulation is inapplicable to the subspaces most relevant to algorithm designers, such as those defined by top performance. To overcome this issue, we derive a novel formulation of f-ANOVA for arbitrary subspaces and propose an algorithm that uses Pearson divergence (PED) to enable a closed-form calculation of HPI. We demonstrate that this new algorithm, dubbed PED-ANOVA, is able to successfully identify important HPs in different subspaces while also being extremely computationally efficient.
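To illustrate the idea of scoring an HP by a Pearson divergence, the following is a minimal toy sketch, not the PED-ANOVA algorithm itself. It assumes HPs scaled to [0, 1], simple histogram density estimates, and the convention D(p || q) = sum_i q_i (p_i / q_i - 1)^2 (some authors include a factor of 1/2). The names `histogram`, `pearson_divergence`, and `toy_importance` are hypothetical helpers introduced only for this example. For each HP, the sketch compares the marginal distribution of the top-performing configurations against the marginal over all configurations; a larger divergence suggests that the HP matters more for reaching that top subspace.

```python
import random


def histogram(values, n_bins, lo=0.0, hi=1.0):
    """Return the fraction of values falling in each of n_bins equal-width bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp v == hi into the last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]


def pearson_divergence(p, q):
    """Discrete Pearson divergence sum_i q_i * (p_i / q_i - 1)^2.

    Bins with q_i == 0 are skipped; this is safe here because p is built
    from a subset of the samples used for q, so p_i is also 0 there.
    """
    return sum(qi * (pi / qi - 1.0) ** 2 for pi, qi in zip(p, q) if qi > 0)


def toy_importance(configs, losses, top_fraction, n_bins=10):
    """Score each HP by the divergence between its top-subset and overall marginals."""
    n_top = max(1, int(len(losses) * top_fraction))
    order = sorted(range(len(losses)), key=lambda i: losses[i])  # lower loss is better
    top = [configs[i] for i in order[:n_top]]
    scores = []
    for d in range(len(configs[0])):
        q = histogram([c[d] for c in configs], n_bins)  # marginal over all configs
        p = histogram([c[d] for c in top], n_bins)      # marginal over the top subset
        scores.append(pearson_divergence(p, q))
    return scores


# Synthetic data (an assumption for illustration): the loss depends
# strongly on the first HP and only weakly on the second.
rng = random.Random(0)
configs = [(rng.random(), rng.random()) for _ in range(2000)]
losses = [(x - 0.5) ** 2 + 0.001 * y for x, y in configs]

scores = toy_importance(configs, losses, top_fraction=0.1)
print(scores)
```

In this synthetic setup the first HP should receive a much larger score than the second, because the top 10% of configurations cluster tightly around x = 0.5 while remaining nearly uniform in y. Histograms are used here only for simplicity; any density estimator could be substituted under the same assumptions.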