OpenAlex is a promising open source of scholarly metadata and a competitor to the established proprietary sources, Web of Science and Scopus. Because OpenAlex provides its data freely and openly, it permits researchers to perform bibliometric studies that can be reproduced by the community without licensing barriers. However, OpenAlex is a rapidly evolving source whose data is both expanding and changing quickly, so the question naturally arises as to how trustworthy its data is. In this empirical paper, we study the reference and metadata coverage of each database and compare them with one another to help address this open question in bibliometrics. In our large-scale study, we demonstrate that, when restricted to a cleaned dataset of 16,788,282 recent publications shared by all three databases, OpenAlex has average reference counts comparable to those of both Web of Science and Scopus. A comparison of other core metadata shows mixed results: relative to both Web of Science and Scopus, OpenAlex captures more ORCID identifiers, fewer abstracts, and a similar amount of Open Access information per article.