Automated Machine Learning (AutoML) automatically builds machine learning (ML) models from data. The de facto standard for evaluating new AutoML frameworks on tabular data is the AutoML Benchmark (AMLB), which proposed evaluating AutoML frameworks with 1- and 4-hour time budgets across 104 tasks. We argue that shorter time constraints should also be considered, both for their practical value, such as when models must be retrained frequently, and to make AMLB more accessible. This work considers two ways to reduce the overall computation used in the benchmark: smaller time constraints and early stopping. We evaluate 11 AutoML frameworks on 104 tasks under different time constraints and find that the relative ranking of the frameworks is fairly consistent across time constraints, but that using early stopping leads to a greater variety in model performance.