Timely detection of problematic research articles is a vital task. This study explores whether Twitter mentions of retracted articles can signal potential problems with those articles before retraction, and thus help predict the future retraction of problematic articles. A dataset of 3,505 retracted articles and their associated Twitter mentions is analyzed alongside 3,505 non-retracted articles with similar characteristics, obtained using the Coarsened Exact Matching method. The effectiveness of Twitter mentions in predicting article retraction is evaluated with four prediction methods: manual labelling, keyword identification, machine learning models, and ChatGPT. Manual labelling shows that some retracted articles do have Twitter mentions containing recognizable evidence of problems before retraction, although they account for only a limited share (approximately 16%) of all retracted articles with Twitter mention data. With the manual labelling results as the baseline, ChatGPT outperforms the other methods, suggesting its potential to assist human judgment in predicting article retraction. This study uncovers both the potential and the limitations of social media events as an early warning system for article retraction, and sheds light on a potential application of generative artificial intelligence in promoting research integrity.
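As a rough illustration of the keyword identification method and of scoring a method against the manual labelling baseline, the following minimal sketch flags an article when any of its pre-retraction tweets matches a keyword, then computes precision, recall, and F1 against manual labels. The keyword list, the function names `flag_article` and `precision_recall_f1`, and the toy tweets and labels are all assumptions made for illustration; the abstract does not specify the keywords, data format, or evaluation metrics actually used.

```python
import re

# Hypothetical keyword stems; the study's actual keyword list is not given in the abstract.
PROBLEM_KEYWORDS = ["fraud", "fabricat", "plagiar", "manipulat", "misconduct", "duplicat"]
PATTERN = re.compile("|".join(PROBLEM_KEYWORDS), re.IGNORECASE)


def flag_article(tweets):
    """Return True if any tweet contains one of the problem keyword stems."""
    return any(PATTERN.search(tweet) for tweet in tweets)


def precision_recall_f1(predicted, actual):
    """Score boolean predictions against boolean baseline labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Toy pre-retraction tweets, invented for this example (not from the study's dataset).
articles = {
    "A": ["Great new study on sleep!", "Figure 3 looks manipulated, same blot twice"],
    "B": ["Interesting read"],
    "C": ["This was plagiarized from a 2012 paper"],
    "D": ["Not fraud, just a surprising result"],
}
# Invented manual labels: True means a human judged the tweets to signal a problem.
manual_labels = {"A": True, "B": False, "C": True, "D": False}

ids = sorted(articles)
predicted = [flag_article(articles[i]) for i in ids]
actual = [manual_labels[i] for i in ids]
precision, recall, f1 = precision_recall_f1(predicted, actual)
print(predicted, round(precision, 3), round(recall, 3), round(f1, 3))
```

In this toy data, article D is a false positive because a keyword appears in a negated context, which is the kind of error that plain keyword matching cannot avoid.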