Advanced programmatic hyperparameter optimization (HPO) methods, such as Bayesian optimization, are highly sample-efficient at reproducibly finding optimal hyperparameter values for machine learning (ML) models. Yet ML practitioners often apply less sample-efficient HPO methods, such as grid search, which often results in under-optimized ML models. We suspect that practitioners choose HPO methods based on individual motives, which consist of contextual factors and individual goals. However, these motives remain unclear, which hinders both the evaluation of HPO methods against specific goals and the user-centered development of HPO tools. To understand practitioners' motives for using specific HPO methods, we took a mixed-methods approach: 20 semi-structured interviews and a survey of 71 ML experts to gather evidence of the external validity of the interview results. By presenting six main goals (e.g., improving model understanding) and 14 contextual factors (e.g., available computing resources) that affect practitioners' selection of HPO methods, our study explains why practitioners use HPO methods that seem inappropriate at first glance. This study lays a foundation for designing user-centered and context-adaptive HPO tools and, thus, for linking social and technical research on HPO.