We report evidence of an undocumented method for manipulating citation counts that involves 'sneaked' references. Sneaked references are registered in the metadata of scientific articles in which they do not appear. This manipulation exploits the trusted relationships between several actors: publishers, the Crossref metadata registration agency, digital libraries, and bibliometric platforms. By collecting metadata from various sources, we show that undue extra references are sneaked in at Digital Object Identifier (DOI) registration time, resulting in artificially inflated citation counts. In a case study of three journals from a single publisher, we found that at least 9% of references (5,978/65,836) were sneaked, mainly benefiting two authors. Although they do not appear in the articles, these sneaked references exist in metadata registries and propagate inappropriately to bibliometric dashboards. Furthermore, we discovered 'lost' references: the bibliometric platform studied failed to index at least 56% (36,939/65,836) of the references listed in the HTML version of the publications. The extent of sneaked and lost references in the global literature remains unknown and requires further investigation. Bibliometric platforms that produce citation counts should identify, quantify, and correct these flaws to provide accurate data to their patrons and prevent further citation gaming.
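The comparison behind the 'sneaked' and 'lost' categories can be illustrated with simple set differences. The following is a minimal sketch, not the pipeline used in the study: the function names `compare_references` and `share` are hypothetical, the DOIs are made-up placeholders, and it assumes that each reference has already been resolved to a DOI string. Only the two ratios at the end use figures reported above.

```python
def compare_references(in_article, registered, indexed):
    """Classify references of one article by comparing three sources.

    in_article: DOIs cited in the published (e.g. HTML) version
    registered: DOIs declared in the registered metadata
    indexed:    DOIs listed by a bibliometric platform
    (All three inputs are assumptions made for illustration.)
    """
    in_article, registered, indexed = set(in_article), set(registered), set(indexed)
    sneaked = registered - in_article  # in metadata, absent from the article
    lost = in_article - indexed        # in the article, absent from the platform
    return sneaked, lost


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole


# Toy example with placeholder DOIs (not real data).
sneaked, lost = compare_references(
    in_article={"10.1234/a", "10.1234/b"},
    registered={"10.1234/a", "10.1234/b", "10.1234/x"},
    indexed={"10.1234/a"},
)
print(sneaked)  # {'10.1234/x'}
print(lost)     # {'10.1234/b'}

# Ratios reported in the abstract.
print(round(share(5978, 65836), 1))   # 9.1
print(round(share(36939, 65836), 1))  # 56.1
```

In practice, matching references across sources would likely need more than exact DOI equality, since some references carry no DOI or are formatted inconsistently; the sketch leaves that step out.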