Scientific novelty is a critical construct in bibliometrics and is commonly measured by aggregating pairwise distances between the knowledge units underlying a paper. While prior work has refined how such distances are computed, less attention has been paid to how dyadic relations are aggregated to characterize novelty at the paper level. We address this gap by introducing a network-based indicator, Cognitive Traversal Distance (CTD). Conceptualizing the historical literature as a weighted knowledge network, we define CTD as the length of the shortest path required to connect all knowledge units associated with a paper. CTD thus provides a paper-level novelty measure that reflects the minimal structural distance needed to integrate multiple knowledge units, moving beyond mean- or quantile-based aggregation of pairwise distances. Using 27 million biomedical publications indexed by OpenAlex, with Medical Subject Headings (MeSH) as standardized knowledge units, we evaluate CTD against expert-based novelty benchmarks drawn from F1000Prime-recommended papers and Nobel Prize-winning publications. CTD consistently outperforms conventional aggregation-based indicators on these benchmarks. We further show that MeSH-based CTD is less sensitive to novelty driven by the emergence of entirely new conceptual labels, which clarifies its scope relative to recent text-based measures.
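To make the definition concrete, the following is a minimal sketch of one possible reading of CTD, not the authors' implementation. It assumes an undirected knowledge network with non-negative edge weights, and it interprets "the shortest path required to connect all knowledge units" as the cheapest ordering in which a paper's units can be visited, where each step costs the network shortest-path distance between two units. The function names (`build_graph`, `dijkstra`, `traversal_distance`), the toy graph, and the brute-force search over orderings are all illustrative assumptions; the abstract does not specify how edge weights are derived or how the path is computed at scale.

```python
import heapq
from itertools import combinations, permutations


def build_graph(edges):
    """Build an undirected weighted adjacency dict from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def dijkstra(graph, source):
    """Return shortest-path distances from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def traversal_distance(graph, units):
    """Cheapest ordering of `units`, costed by network shortest-path distances.

    Brute force over orderings (an assumption for illustration), so this is
    only practical for a handful of units per paper.
    """
    units = list(dict.fromkeys(units))  # drop duplicates, keep order
    if len(units) < 2:
        return 0.0
    sp = {u: dijkstra(graph, u) for u in units}
    inf = float("inf")
    return min(
        sum(sp[a].get(b, inf) for a, b in zip(order, order[1:]))
        for order in permutations(units)
    )


def mean_pairwise_distance(graph, units):
    """Conventional baseline: mean shortest-path distance over all unit pairs."""
    units = list(dict.fromkeys(units))
    sp = {u: dijkstra(graph, u) for u in units}
    pairs = list(combinations(units, 2))
    return sum(sp[a].get(b, float("inf")) for a, b in pairs) / len(pairs)


# Hypothetical toy network; labels and weights are made up for illustration.
toy = build_graph([
    ("A", "B", 1.0),
    ("B", "C", 2.0),
    ("A", "C", 5.0),
    ("C", "D", 1.0),
])

ctd = traversal_distance(toy, ["A", "C", "D"])
baseline = mean_pairwise_distance(toy, ["A", "C", "D"])
```

On this toy network the cheapest ordering is A, C, D: the step from A to C costs 3 (via B) and the step from C to D costs 1, so the traversal distance is 4, whereas the mean of the three pairwise distances (3, 1, 4) is 8/3. The example only illustrates how a path-based aggregate can differ from a mean-based one; it says nothing about which performs better empirically.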