AutoML platforms offer numerous candidate algorithms for each step of an analysis, such as imputation, transformation, feature selection, and modelling. Finding the optimal combination of algorithms and hyper-parameter values is computationally expensive, because the number of combinations to explore grows exponentially. In this paper, we present the Sequential Hyper-parameter Space Reduction (SHSR) algorithm, which reduces the search space of an AutoML tool with a negligible drop in its predictive performance. SHSR is a meta-level learning algorithm that analyzes past runs of an AutoML tool on several datasets and learns which hyper-parameter values to exclude from consideration when analyzing a new dataset. SHSR is evaluated on 284 classification and 375 regression problems, showing an approximately 30% reduction in execution time with a performance drop of less than 0.1%.
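To make the idea of filtering hyper-parameter values from past runs concrete, the following is a minimal sketch and not the SHSR algorithm itself, whose procedure the abstract does not specify. It assumes a hypothetical four-step search space (`SPACE`), synthetic past-run scores (`toy_runs`), and a naive greedy rule (`reduce_space`) that drops a value when, on every past dataset, the best score reachable without it stays within a tolerance `tol` of the best score in the full space. All names, values, and scores are illustrative assumptions.

```python
from itertools import product
from math import prod

# Hypothetical search space: candidate values per pipeline step (assumption).
SPACE = {
    "imputation": ["mean", "median", "knn"],
    "transformation": ["none", "standardize", "log"],
    "feature_selection": ["none", "lasso", "univariate"],
    "model": ["ridge", "random_forest", "svm", "gbm"],
}


def space_size(space):
    """Number of configurations: the product of the per-step option counts."""
    return prod(len(values) for values in space.values())


def best_score(runs, space):
    """Best past score among configurations that lie inside `space`."""
    scores = [s for cfg, s in runs if all(cfg[k] in space[k] for k in space)]
    return max(scores) if scores else float("-inf")


def reduce_space(space, past_runs, tol=0.001):
    """Naive greedy filter (not SHSR): visit values one at a time and drop a
    value if no past dataset loses more than `tol` in best achievable score."""
    baseline = {d: best_score(runs, space) for d, runs in past_runs.items()}
    reduced = {k: list(v) for k, v in space.items()}
    for step in space:
        for value in space[step]:
            if len(reduced[step]) == 1:
                break  # always keep at least one option per step
            trial = dict(reduced)
            trial[step] = [v for v in reduced[step] if v != value]
            if all(baseline[d] - best_score(runs, trial) <= tol
                   for d, runs in past_runs.items()):
                reduced = trial
    return reduced


def toy_runs(bonus):
    """Synthetic scores for every configuration: 0.5 plus per-value bonuses.
    Purely illustrative; these are not results from any real AutoML run."""
    keys = list(SPACE)
    runs = []
    for combo in product(*SPACE.values()):
        cfg = dict(zip(keys, combo))
        runs.append((cfg, 0.5 + sum(bonus.get(v, 0.0) for v in combo)))
    return runs


past_runs = {
    "dataset_a": toy_runs({"gbm": 0.2, "median": 0.05}),
    "dataset_b": toy_runs({"random_forest": 0.2, "lasso": 0.05}),
}
reduced = reduce_space(SPACE, past_runs)
print(space_size(SPACE), "->", space_size(reduced))
```

On this toy data the space shrinks from 108 configurations to 2, because only the values that carry a bonus on some past dataset are needed to reach the best scores. The greedy order matters: when several values are equally dispensable, the sketch keeps whichever is visited last, which a real method would need to handle more carefully.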