Bibliometric and scientometric analyses offer valuable perspectives on the research landscape and collaboration dynamics across academic disciplines. This paper presents pyBibX, a Python library for conducting bibliometric and scientometric analyses on raw data files sourced from Scopus, Web of Science, and PubMed, with state-of-the-art AI capabilities integrated into its core functionality. The library performs a comprehensive exploratory data analysis (EDA) and presents the results graphically. It also provides network capabilities covering citation, collaboration, and similarity analysis. Furthermore, the library incorporates AI capabilities, including embedding vectors, topic modeling, text summarization, and other general natural language processing tasks, using models such as Sentence-BERT, BERTopic, BERT, ChatGPT, and PEGASUS. As a demonstration, we analyzed 184 documents on multiple-criteria decision analysis published between 1984 and 2023. The EDA showed growing interest in decision-making and fuzzy logic methodologies. The network analysis highlighted the importance of central authors and intra-continental collaboration, identifying Canada and China as crucial collaboration hubs. Finally, the AI analysis distinguished two primary topics and showed ChatGPT's preeminence in text summarization. ChatGPT also proved to be an indispensable instrument for interpreting results, as our library enables researchers to pose questions to it about bibliometric outcomes. Even so, data homogeneity remains a daunting challenge due to inconsistencies among databases. pyBibX is the first application integrating cutting-edge AI capabilities for analyzing scientific publications, enabling researchers to examine and interpret these outcomes more effectively.
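To make the collaboration analysis mentioned in the abstract concrete, the following is a minimal sketch of how co-authorship and country-level collaboration links can be counted from bibliographic records. It uses only the Python standard library and does not reproduce pyBibX's API. The record layout, the function names `collaboration_edges` and `weighted_degree`, and the toy records are all assumptions made for illustration, not data from the study.

```python
from collections import Counter
from itertools import combinations

# Toy records invented for illustration (not the paper's dataset).
# Each record lists the authors of one document and their countries.
records = [
    {"authors": ["A. Silva", "B. Chen"], "countries": ["Brazil", "China"]},
    {"authors": ["B. Chen", "C. Roy"], "countries": ["China", "Canada"]},
    {"authors": ["B. Chen", "C. Roy", "D. Wu"],
     "countries": ["China", "Canada", "China"]},
]


def collaboration_edges(records, field):
    """Count how often each unordered pair of distinct values of `field`
    appears together in the same record (hypothetical helper)."""
    edges = Counter()
    for rec in records:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(rec[field]))
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights incident to each node (hypothetical helper)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


author_edges = collaboration_edges(records, "authors")
country_edges = collaboration_edges(records, "countries")
author_degree = weighted_degree(author_edges)
country_degree = weighted_degree(country_edges)

# Nodes with the highest weighted degree are candidate collaboration hubs.
print(author_degree.most_common(3))
print(country_degree.most_common(3))
```

Weighted degree is only one simple proxy for centrality. Other measures, such as betweenness, may rank authors or countries differently.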