Ontologies play a critical role in Semantic Web technologies: they provide a structured, standardized way to represent knowledge and enable machines to understand the meaning of data. Several taxonomies and ontologies have been generated, but each targets a single domain, and some have proven expensive in time and manual effort. They also lack coverage of the unconventional topics needed for a more holistic and comprehensive view of the knowledge landscape and of interdisciplinary collaborations. Thus, there is a need for an ontology that covers science and technology and facilitates multidisciplinary research by connecting topics from different fields and domains that may be related or have commonalities. To address these issues, we present an automatically constructed Science and Technology Ontology (S&TO) that covers unconventional topics in different science and technology domains. The proposed S&TO can promote the discovery of new research areas and collaborations across disciplines. The ontology is constructed by applying BERTopic to a dataset of 393,991 scientific articles collected from Semantic Scholar from October 2021 to August 2022, covering four fields of science. Currently, S&TO includes 5,153 topics and 13,155 semantic relations. The S&TO model can be updated by running BERTopic on more recent datasets.