Model selection is a necessary step in unsupervised machine learning. Despite numerous criteria and metrics, model selection remains subjective. A high degree of subjectivity may raise questions about the repeatability and reproducibility of machine learning studies and doubts about the robustness of models deployed in the real world. Yet the impact of modelers' preferences on model selection outcomes remains largely unexplored. This study uses the Hidden Markov Model as an example to investigate the subjectivity involved in model selection. We asked 33 participants and three Large Language Models (LLMs) to make model selections in three scenarios. Results revealed variability and inconsistencies in both the participants' and the LLMs' choices, especially when different criteria and metrics disagreed. Sources of subjectivity include varying opinions on the importance of different criteria and metrics, differing views on how parsimonious a model should be, and differing views on how the size of a dataset should influence model selection. The results underscore the importance of developing a more standardized way to document subjective choices made in model selection processes.