Modern machine learning (ML) and artificial intelligence (AI) models, especially large language models (LLMs), are increasingly used to generate scientific hypotheses and mechanistic explanations from observational data. This position paper argues that in the high-dimensional proxy regimes where modern ML excels, mechanistic learning is generically underdetermined: many mutually incompatible mechanisms induce essentially the same observational relationships on the support of the data, so neither predictive success nor a coherent explanation is sufficient evidence that a mechanism has been discovered. This underdetermination is especially hazardous with LLMs, which tend to collapse a large equivalence class of candidate explanations into a single fluent narrative. This paper proposes concrete standards for ``mechanistic ML'' and argues that these norms are necessary if LLM-centered workflows are to support science rather than merely simulate it.