Information presented in Wikipedia articles must be attributable to reliable published sources in the form of references. This study examines over 5 million Wikipedia articles to assess the reliability of references across multiple language editions. We quantify the cross-lingual patterns of the perennial sources list, a collection of reliability labels for web domains identified and collaboratively agreed upon by Wikipedia editors. We discover that some sources (or web domains) deemed untrustworthy in one language (i.e., English) continue to appear in articles in other languages. This trend is especially evident with sources tailored for smaller communities. Furthermore, non-authoritative sources found in the English version of a page tend to persist in other language versions of that page. Finally, we present a case study on the Chinese, Russian, and Swedish Wikipedias to demonstrate a discrepancy in reference reliability across cultures. Our findings highlight future challenges in coordinating global knowledge on source reliability.