In recent decades, many countries have started to fund academic institutions based on evaluations of their scientific performance. In this context, post-publication peer review is often used to assess scientific performance, and bibliometric indicators have been suggested as an alternative. A recurrent question is whether peer review and metrics tend to yield similar outcomes. In this paper, we study the agreement between bibliometric indicators and peer review based on a sample of publications submitted for evaluation to the Italian national research assessment exercise (2011--2014). In particular, we study this agreement at a higher aggregation level, namely the institutional level. We also quantify the internal agreement of peer review at the institutional level. We base our analysis on a hierarchical Bayesian model using cross-validation. We find that the level of agreement is generally higher at the institutional level than at the publication level. For certain fields of science, the agreement between metrics and peer review is on par with the internal agreement between two reviewers in this particular context. This suggests that, for some fields, bibliometric indicators could be considered as an alternative to peer review for the Italian national research assessment exercise. Although the results do not necessarily generalise to other contexts, they do raise the question of whether similar findings would hold for other research assessment exercises, such as that of the United Kingdom.