This paper is under review at AI and Ethics. This study examines whether large language models (LLMs) can reliably answer scientific questions and demonstrates how easily they can be influenced by fringe scientific material. The authors configured custom LLMs to prioritise knowledge from selected fringe papers on the fine-structure constant and gravitational waves, then compared their responses with those of domain experts and standard LLMs. The altered models produced fluent, convincing answers that contradicted scientific consensus and were difficult for non-experts to recognise as misleading. The results show that LLMs are vulnerable to manipulation and cannot replace expert judgment, highlighting risks to public understanding of science and the potential for spreading misinformation.