This study presents a comprehensive approach to identifying and analyzing research articles in rapidly evolving fields, using Artificial Intelligence (AI) as a case study. By combining AI-related search terms with the language processing capabilities of generative pre-trained transformers (GPT), we developed a highly accurate method for identifying and analyzing AI-related articles in the Web of Science (WoS) database. Our multi-step approach consists of filtering articles by WoS citation topics and categories, keyword screening, and GPT classification. We evaluated the method through precision and recall calculations and found that the combined approach captured around 94% of AI-related articles in the entire WoS corpus with a precision of 90%. We then analyzed publication volume trends, which revealed an increasing degree of interdisciplinarity. We also conducted citation analysis of the top countries and institutions and identified common research themes using keyword analysis and GPT. This study demonstrates the potential of our approach as a tool for accurately identifying scholarly articles that can also provide insights into the growth, interdisciplinary nature, and key players of a research area.
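The precision and recall figures reported above follow the standard definitions: precision is the share of retrieved articles that are truly AI-related, and recall is the share of truly AI-related articles that were retrieved. The following is a minimal sketch of that calculation, not the study's actual evaluation code. The function name `precision_recall` and the article identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved article IDs
    against a set of IDs judged truly relevant (e.g. AI-related)."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical sample: four articles returned by the pipeline,
# four articles labeled AI-related by a reviewer, three in common.
retrieved_ids = {"art1", "art2", "art3", "art4"}
relevant_ids = {"art2", "art3", "art4", "art5"}

precision, recall = precision_recall(retrieved_ids, relevant_ids)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In practice, both sets would come from a manually labeled sample rather than the full corpus, since the true set of AI-related articles in WoS is not known in advance.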