Science mapping is an important tool for gaining insight into scientific fields, identifying emerging research trends, and supporting science policy. It is therefore critical to understand how different science mapping approaches capture the structure of scientific fields. This paper presents a comparative analysis of two commonly used approaches, topic modeling (TM) and citation-based clustering (CC), assessing their respective strengths, weaknesses, and the characteristics of their results. We compare the two approaches using cluster-to-topic and topic-to-cluster mappings based on science maps of cardiovascular research (CVR) generated by TM and CC. Our findings reveal that relations between topics and clusters are generally weak, with limited overlap in the documents they contain. Only in a few exceptional cases do more than one-third of the documents in a topic belong to the same cluster, or vice versa. CC excels at identifying diseases and generating specialized clusters in Clinical Treatment & Surgical Procedures, whereas TM focuses on sub-techniques within diagnostic techniques, provides a general perspective on Clinical Treatment & Surgical Procedures, and identifies distinct topics related to practical guidelines. Our work enhances the understanding of science mapping approaches based on TM and CC and offers practical guidance for scientometricians on how to apply these approaches effectively.
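To make the one-third criterion concrete, the following is a minimal sketch of how topic-to-cluster and cluster-to-topic overlap shares might be computed. It assumes each document carries exactly one topic label and one cluster label; the abstract does not state how documents are assigned to topics, so that simplification, the function names `overlap_shares` and `strong_links`, and the toy labels are all hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def overlap_shares(assignments):
    """Compute overlap shares from (topic, cluster) pairs, one pair per document.

    Returns two nested dicts:
      topic_to_cluster[t][c] = share of topic t's documents that fall in cluster c
      cluster_to_topic[c][t] = share of cluster c's documents that fall in topic t
    """
    pairs = list(assignments)
    joint = Counter(pairs)                       # documents per (topic, cluster)
    topic_sizes = Counter(t for t, _ in pairs)   # documents per topic
    cluster_sizes = Counter(c for _, c in pairs) # documents per cluster

    topic_to_cluster = defaultdict(dict)
    cluster_to_topic = defaultdict(dict)
    for (t, c), n in joint.items():
        topic_to_cluster[t][c] = n / topic_sizes[t]
        cluster_to_topic[c][t] = n / cluster_sizes[c]
    return dict(topic_to_cluster), dict(cluster_to_topic)


def strong_links(shares, threshold=1 / 3):
    """List (source, target, share) triples whose share exceeds the threshold."""
    return sorted(
        (src, dst, share)
        for src, targets in shares.items()
        for dst, share in targets.items()
        if share > threshold
    )


# Toy data (hypothetical): six documents, two topics, four clusters.
docs = [
    ("T1", "C1"), ("T1", "C1"), ("T1", "C2"),
    ("T2", "C2"), ("T2", "C3"), ("T2", "C4"),
]
t2c, c2t = overlap_shares(docs)
print(strong_links(t2c))  # topic-to-cluster links above one-third
print(strong_links(c2t))  # cluster-to-topic links above one-third
```

In the toy data, topic T2 is spread evenly over three clusters, so none of its links exceeds one-third, which is the kind of weak relation the abstract reports as typical. The two directions are not symmetric: a small cluster can sit entirely inside a topic while accounting for only a minor share of that topic's documents.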