Hyperparameter optimization (HPO) and neural architecture search (NAS) are the methods of choice for obtaining best-in-class machine learning models, but in practice they can be costly to run. When models are trained on large datasets, tuning them with HPO or NAS rapidly becomes prohibitively expensive for practitioners, even when efficient multi-fidelity methods are employed. We propose an approach for tuning machine learning models trained on large datasets when computational resources are limited. Our approach, named PASHA, extends ASHA and dynamically allocates the maximum resources for the tuning procedure as needed. Our experimental comparison shows that PASHA identifies well-performing hyperparameter configurations and architectures while consuming significantly fewer computational resources than ASHA.
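The idea of raising the maximum resource budget only when needed can be illustrated with a toy sketch. The code below is not the PASHA algorithm as specified by its authors. It assumes a synchronous successive-halving loop, a hypothetical scoring function `evaluate(config, resource)` where higher is better, and an assumed stopping rule that raises the resource cap only while the ranking of the surviving configurations differs between the two highest rungs. All function and parameter names are illustrative.

```python
from typing import Any, Callable, List, Tuple

Rung = Tuple[int, List[Any]]  # (resource level, configs ranked best first)


def successive_halving(configs: List[Any],
                       evaluate: Callable[[Any, int], float],
                       max_resource: int,
                       min_resource: int = 1,
                       eta: int = 3) -> List[Rung]:
    """Synchronous successive halving: keep the top 1/eta at each rung."""
    survivors = list(configs)
    resource = min_resource
    rungs: List[Rung] = []
    while survivors and resource <= max_resource:
        # Rank the surviving configs by their score at this resource level.
        ranked = sorted(survivors, key=lambda c: evaluate(c, resource),
                        reverse=True)
        rungs.append((resource, ranked))
        survivors = ranked[:max(1, len(ranked) // eta)]
        resource *= eta
    return rungs


def progressive_search(configs: List[Any],
                       evaluate: Callable[[Any, int], float],
                       hard_max: int,
                       min_resource: int = 1,
                       eta: int = 3) -> Tuple[Any, int]:
    """Grow the resource cap only while the top two rungs disagree.

    Assumed stopping rule for illustration. Each round reruns the search
    from scratch for simplicity; a practical scheduler would reuse results.
    """
    if not configs:
        raise ValueError("configs must not be empty")
    cap = min(min_resource * eta, hard_max)  # start with a small cap
    while True:
        rungs = successive_halving(configs, evaluate, cap, min_resource, eta)
        top_resource, top_rank = rungs[-1]
        if len(rungs) < 2:
            return top_rank[0], top_resource
        _, prev_rank = rungs[-2]
        # Order of the top-rung configs as seen one rung below.
        prev_order = [c for c in prev_rank if c in top_rank]
        stable = prev_order == top_rank
        if stable or cap * eta > hard_max:
            return top_rank[0], top_resource
        cap *= eta  # rankings still changing: allow more resources
```

When the ranking does not depend on the resource level, the search in this sketch stops at the initial cap and never spends the full `hard_max` budget. When the ranking shifts between rungs, the cap grows by a factor of `eta` and the search continues.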