Large language models (LLMs) have rapidly emerged in civil and environmental engineering (CEE) research, education, and practice as tools for project ideation, execution, and communication. However, it is unknown how prevalent LLM adoption is across CEE scholarship and whether it measurably alters research prose. Inspired by recent analyses of biomedical research, this study uses a vocabulary-based frequency-shift methodology to detect linguistic signals of LLM-assisted writing in a large corpus of CEE literature. A total of 149,452 abstracts published by the American Society of Civil Engineers from 2000 through 2025 are analyzed to quantify deviations from long-term vocabulary trends. Prior to the introduction of LLMs in 2022, CEE publications exhibit long-term trends toward longer abstracts and sentences, greater use of segmenting punctuation, higher required reading levels, and a shift toward active, first-person verb constructions. Beginning around 2023, however, the frequencies of many stylistic marker words (e.g., enhance) sharply depart from historical trajectories, accompanied by deviations in multiple semantic properties. Abstracts classified as likely LLM-assisted exhibit increased lexical diversity, comma use, and complexity, with reduced passive voice and hedging language, producing prose that is more segmented, complex, and confident. The AI contribution of this study lies in the use of natural language processing to identify population-level linguistic signals of LLM-assisted text, applied to quantify the prevalence of LLM use and its influence on the vocabulary, structure, and tone of engineering scholarly writing. Together, these findings provide the first large-scale, data-driven assessment of how LLMs are beginning to reshape scholarly communication in CEE.
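The abstract does not spell out how the frequency-shift measure is computed, so the following is only a minimal sketch of one plausible reading: for a given marker word, compute the share of abstracts per year that contain it, fit a straight line to the pre-LLM years, extrapolate that line to a later year, and report the gap between observed and expected shares. The linear baseline, the per-abstract counting, and the function names (`yearly_word_share`, `fit_line`, `excess_frequency`) are illustrative assumptions, not the study's actual procedure.

```python
import re


def yearly_word_share(abstracts_by_year, word):
    """Share of abstracts in each year that contain `word` at least once.

    `abstracts_by_year` maps year -> list of abstract strings (assumed layout).
    Matching is on exact lowercase tokens, so inflected forms are not counted.
    """
    shares = {}
    for year, abstracts in abstracts_by_year.items():
        hits = sum(
            1 for text in abstracts
            if word in set(re.findall(r"[a-z]+", text.lower()))
        )
        shares[year] = hits / len(abstracts)
    return shares


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def excess_frequency(shares, baseline_end, target_year):
    """Observed share in `target_year` minus the share expected from the
    linear trend fitted to all years up to and including `baseline_end`.

    The linear trend is an assumption made for this sketch.
    """
    baseline = [(y, s) for y, s in sorted(shares.items()) if y <= baseline_end]
    slope, intercept = fit_line(
        [y for y, _ in baseline], [s for _, s in baseline]
    )
    # A share cannot be negative, so clamp the extrapolated value at zero.
    expected = max(slope * target_year + intercept, 0.0)
    return shares[target_year] - expected
```

Applied to a marker word such as "enhance", a large positive gap for years after 2022 would correspond to the sharp departure from historical trajectories that the abstract describes; a gap near zero would indicate the word is still following its long-term trend.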