Wikipedia serves as key infrastructure for public access to scientific knowledge, but it faces challenges in maintaining the credibility of cited sources, especially when scientific papers are retracted. This paper investigates how citations to retracted research are handled on English Wikipedia. We construct a novel dataset that integrates Wikipedia revision histories with metadata from Retraction Watch, Crossref, Altmetric, and OpenAlex, identifying 1,181 citations of retracted papers. We find that 71.6% of these citations were initially problematic and in need of reader-facing repair: they were either added before the paper's retraction (51.5%) or introduced afterwards without a proper warning (20.1%). While many are eventually corrected, our analysis reveals that problematic citations persist for a median of 3.68 years (1,344 days). Through survival analysis, we find that bot-mediated flagging (RetractionBot), open access availability, pre-existing online visibility (e.g., Twitter/X mention counts), and page-level organization (e.g., the number of categories on a Wikipedia page) are associated with a higher hazard of correction, that is, with faster repair. Conversely, a paper's established scholarly authority, measured as a higher academic citation count, is associated with slower correction. Our findings highlight how the Wikipedia community supports collaborative maintenance but leaves gaps in citation-level repair. We contribute to CSCW research by advancing our understanding of this sociotechnical vulnerability, which takes the form of a community coordination challenge, and by offering design directions to support citation credibility at scale.
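As a rough illustration of how a median time to correction can be estimated when some citations are still uncorrected at the end of observation (right-censored), the sketch below implements a plain Kaplan-Meier estimator. It is not the paper's pipeline: the function names `kaplan_meier` and `median_survival` are hypothetical, and the durations are made-up toy values, not figures from the dataset.

```python
from fractions import Fraction
from typing import List, Optional, Tuple


def kaplan_meier(durations: List[int], observed: List[bool]) -> List[Tuple[int, Fraction]]:
    """Return (time, survival probability) at each distinct event time.

    durations: days a citation stayed uncorrected (or was observed so far).
    observed:  True if the correction happened, False if censored.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = Fraction(1)
    curve = []
    for t in event_times:
        # Citations still uncorrected (and still under observation) just before t.
        at_risk = sum(1 for d in durations if d >= t)
        # Corrections that happen exactly at t.
        events = sum(1 for d, e in zip(durations, observed) if e and d == t)
        survival *= Fraction(at_risk - events, at_risk)
        curve.append((t, survival))
    return curve


def median_survival(curve: List[Tuple[int, Fraction]]) -> Optional[int]:
    """Smallest time at which the survival probability drops to 0.5 or below."""
    for t, s in curve:
        if s <= Fraction(1, 2):
            return t
    return None  # the curve never reaches 0.5 (too much censoring)


# Toy data (assumed for illustration only): days until correction.
durations = [30, 120, 400, 400, 800, 1500, 2000]
observed = [True, True, True, False, True, False, True]

curve = kaplan_meier(durations, observed)
print(median_survival(curve))  # → 800
```

Exact fractions are used so that the comparison against 0.5 is not affected by floating-point rounding. Covariate effects such as bot flagging or open access status would need a regression model on top of this (for example, a proportional hazards model), which this sketch does not attempt.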