We propose a novel white-box approach to hyper-parameter optimization. Motivated by recent work relating flat minima to generalization, we first establish a relationship between the strong convexity of the loss and its flatness. Based on this relationship, we seek hyper-parameter configurations that improve flatness by minimizing the strong convexity parameter of the loss. Exploiting the structure of the underlying neural network, we derive closed-form equations that approximate the strong convexity parameter, and we search for hyper-parameters that minimize it in a randomized fashion. Through experiments on 14 classification datasets, we show that our method achieves strong performance at a fraction of the runtime.
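For reference, the standard textbook definition of strong convexity that the abstract builds on can be written as below. This is background only, not the paper's closed-form approximation, which the abstract does not state; the symbols $f$, $\mu$, $x$, and $y$ are chosen here for illustration.

```latex
% Standard definition: f is \mu-strongly convex (\mu > 0) if, for all x and y,
\[
  f(y) \;\ge\; f(x) + \nabla f(x)^{\top}(y - x) + \frac{\mu}{2}\,\lVert y - x \rVert_2^2 .
\]
% For twice-differentiable f, this is equivalent to a uniform lower bound on the Hessian:
\[
  \nabla^2 f(x) \succeq \mu I \quad \text{for all } x,
  \qquad\text{i.e.}\qquad
  \lambda_{\min}\!\bigl(\nabla^2 f(x)\bigr) \ge \mu .
\]
```

Under this definition, a smaller $\mu$ allows the loss to have lower curvature in at least one direction, which is plausibly the intuition behind linking a smaller strong convexity parameter to flatter minima. The formal statement in the paper may differ, in particular because neural network losses are generally non-convex.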