Information presented in Wikipedia articles must be attributable to reliable published sources in the form of references. This study examines over 5 million Wikipedia articles to assess the reliability of references across multiple language editions. We quantify the cross-lingual patterns of the perennial sources list, a collection of reliability labels for web domains identified and collaboratively agreed upon by Wikipedia editors. We find that some sources (or web domains) deemed untrustworthy in one language (i.e., English) continue to appear in articles in other languages. This trend is especially evident for sources tailored to smaller communities. Furthermore, non-authoritative sources found in the English version of a page tend to persist in other language versions of that page. Finally, we present a case study on the Chinese, Russian, and Swedish Wikipedias to demonstrate a discrepancy in reference reliability across cultures. Our findings highlight future challenges in coordinating global knowledge on source reliability.