Scientists often use generics, that is, unquantified statements about whole categories of people or phenomena, when communicating research findings (e.g., "statins reduce cardiovascular events"). Large language models (LLMs), such as ChatGPT, frequently adopt the same style when summarizing scientific texts. However, generics can prompt overgeneralization, especially when they are interpreted differently across audiences. In a study comparing laypeople, scientists, and two leading LLMs (ChatGPT-5 and DeepSeek), we found systematic differences in the interpretation of generics. Compared to most scientists, laypeople judged scientific generics to be more generalizable and more credible, and LLMs rated them higher still. These mismatches pose significant risks for science communication: scientists may use generics and incorrectly assume that laypeople share their interpretation, while LLMs may systematically overgeneralize scientific findings when summarizing research. Our findings underscore the need for greater attention to language choices in both human and LLM-mediated science communication.