For many public research organizations, funding the creation of science and maximizing scientific output are of central interest. When scientific production is evaluated for funding, citations are typically used as a proxy, although they are strongly influenced by factors other than scientific impact. This study aims to mitigate the consequences of the Matthew effect in citations, whereby prominent authors and prestigious journals receive more citations regardless of the scientific content of their publications. To this end, the study presents an approach to predicting the citations of papers based solely on observable characteristics available at the submission stage of a double-blind peer-review process. Applying classical linear models and generalized linear models to large-scale data sets of biomedical papers from the PubMed database, we show that fairly accurate citation predictions are possible using only observable characteristics of papers, excluding information on authors and journals, thereby mitigating the Matthew effect. The outcomes thus have important implications for the field of scientometrics: they provide a more objective method for citation prediction, one that relies on pre-publication variables that are immune to manipulation by authors and journals. Our approach is therefore relevant for government agencies whose aim is to fund the creation of high-quality scientific content rather than to perpetuate prestige.