Hyperparameter Optimization (HPO) of deep learning-based models tends to be a compute-intensive process, as it usually requires training the target model with many different hyperparameter configurations. We show that integrating model performance prediction with early stopping methods holds great potential to speed up the HPO process of deep learning models. Moreover, we propose a novel algorithm called Swift-Hyperband that can use either classical or quantum support vector regression for performance prediction and can benefit from distributed High Performance Computing environments. The algorithm is tested not only on the Machine-Learned Particle Flow model used in High Energy Physics, but also on a wider range of target models from domains such as computer vision and natural language processing. In all test cases, Swift-Hyperband finds comparable or better hyperparameters while using fewer computational resources.
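The general idea of combining performance prediction with early stopping can be illustrated with a small sketch. This is not the Swift-Hyperband algorithm itself, only a toy under stated assumptions: the learning curves are synthetic, all function names (`loss_at`, `curve`, `knn_predict`, `prune_with_prediction`) are hypothetical, and a simple k-nearest-neighbour regressor stands in for the support vector regression named in the abstract so that the example needs only the Python standard library. A few "pilot" configurations are trained to the full budget, a predictor is fitted from their partial curves to their final losses, and the remaining configurations are trained only partially and ranked by predicted final loss.

```python
import math
import random


def loss_at(config, epoch):
    """Toy learning curve (assumption): decays toward a config-specific floor."""
    floor, rate = config
    return floor + math.exp(-rate * epoch)


def curve(config, epochs):
    """Simulate 'training' a configuration for the given number of epochs."""
    return [loss_at(config, e) for e in range(1, epochs + 1)]


def knn_predict(train_x, train_y, x, k=3):
    """Stand-in predictor: mean final loss of the k closest partial curves.

    The abstract names support vector regression; k-NN is swapped in here
    purely to keep the sketch dependency-free.
    """
    pairs = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))
    nearest = pairs[:k]
    return sum(y for _, y in nearest) / len(nearest)


def prune_with_prediction(configs, partial_epochs, full_epochs, n_pilot, keep):
    """Keep the `keep` best configs, ranking non-pilot ones by predicted loss."""
    pilot, rest = configs[:n_pilot], configs[n_pilot:]

    # Pilot configs are trained to the full budget to build the training set.
    pilot_curves = [curve(c, full_epochs) for c in pilot]
    train_x = [pc[:partial_epochs] for pc in pilot_curves]
    train_y = [pc[-1] for pc in pilot_curves]
    scored = list(zip(train_y, pilot))
    epochs_used = n_pilot * full_epochs

    # Remaining configs are trained only partially, then scored by prediction.
    for c in rest:
        partial = curve(c, partial_epochs)
        scored.append((knn_predict(train_x, train_y, partial), c))
        epochs_used += partial_epochs

    scored.sort(key=lambda p: p[0])  # lower (predicted) loss is better
    return [c for _, c in scored[:keep]], epochs_used


# Usage: 27 random configurations, 9 of them used as pilots.
rng = random.Random(0)
configs = [(rng.uniform(0.1, 1.0), rng.uniform(0.2, 1.0)) for _ in range(27)]
survivors, epochs_used = prune_with_prediction(
    configs, partial_epochs=3, full_epochs=9, n_pilot=9, keep=9
)
print(len(survivors), "survivors,", epochs_used, "epochs vs", 27 * 9, "without prediction")
```

In this toy setting the saving comes entirely from not training the non-pilot configurations past `partial_epochs`; whether the ranking stays reliable depends on the quality of the predictor, which is why the choice of regressor matters in practice.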