The use of ChatGPT and similar Large Language Model (LLM) tools in scholarly communication and academic publishing has been widely discussed since they became easily accessible to a general audience in late 2022. This study uses keywords known to be disproportionately present in LLM-generated text to provide an overall estimate for the prevalence of LLM-assisted writing in the scholarly literature. For the publishing year 2023, it is found that several of those keywords show a distinctive and disproportionate increase in their prevalence, individually and in combination. It is estimated that at least 60,000 papers (slightly over 1% of all articles) were LLM-assisted, though this number could be extended and refined by analysis of other characteristics of the papers or by identification of further indicative keywords.
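The keyword-based estimate described above can be illustrated with a minimal sketch. It is not the study's actual procedure: it assumes a simple baseline in which a keyword's share of all papers in earlier years is projected onto the target year, and any surplus over that projection is counted as excess. The function name `excess_papers`, its parameters, and all counts below are hypothetical and chosen only for illustration.

```python
from statistics import mean


def excess_papers(keyword_counts, total_counts, target_year, baseline_years):
    """Estimate how many papers in target_year contain a keyword beyond
    what the baseline years would predict.

    keyword_counts: dict mapping year -> papers containing the keyword
    total_counts:   dict mapping year -> all papers published that year
    Assumption: the keyword's share of papers would have stayed at its
    baseline average in the absence of LLM-assisted writing.
    """
    # Average share of papers containing the keyword in the baseline years.
    baseline_share = mean(
        keyword_counts[y] / total_counts[y] for y in baseline_years
    )
    # Papers we would expect in the target year at the baseline share.
    expected = baseline_share * total_counts[target_year]
    observed = keyword_counts[target_year]
    # Clamp at zero: a keyword that declined contributes no excess.
    return max(0.0, observed - expected)


# Invented toy counts, NOT data from the study.
toy_keyword = {2020: 1000, 2021: 1100, 2022: 1200, 2023: 2600}
toy_total = {2020: 100_000, 2021: 110_000, 2022: 120_000, 2023: 130_000}

excess = excess_papers(toy_keyword, toy_total, 2023, [2020, 2021, 2022])
excess_share = excess / toy_total[2023]
print(f"excess papers: {excess:.0f}, share of all papers: {excess_share:.2%}")
```

A single keyword gives only a partial picture. Summing the excess across several keywords would double-count papers that contain more than one of them, which is one reason an estimate of this kind is best read as approximate and open to refinement.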