Although the amount of data on the World Wide Web continues to grow exponentially, access to semantically structured information remains limited. The Semantic Web has emerged as a way to make data more machine-readable, and therefore more accessible and interpretable. Websites have used various techniques, such as web scraping and mapping, to provide semantic access. Web scraping extracts useful information from diverse data sources, such as the World Wide Web, using string manipulation operations. Researchers face the challenge of collecting relevant data from multiple sources, which takes substantial time and effort. This research addresses the problem by designing a framework for the semantic organization of research portal data. The framework extracts information from two research portals, Microsoft Academic and IEEE Xplore, with the primary objective of gathering diverse research-related data from them. With this framework, researchers can streamline the collection of information for their work. Semantically organized research portal data is more accessible and interpretable, which supports more effective and efficient knowledge discovery. This research contributes to research data management and promotes the use of Semantic Web technologies in the academic community.
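The scraping-then-semantic-organization idea described in the abstract can be illustrated with a minimal sketch. It is not the framework's actual implementation. It assumes, purely for illustration, that a portal page exposes bibliographic metadata in `citation_*` meta tags and that Dublin Core terms are a suitable target vocabulary. The sample HTML, the `example.org` URI, and the names `MetaTagParser`, `PREDICATES`, and `page_to_triples` are all hypothetical.

```python
from html.parser import HTMLParser

# Assumed mapping from citation-style meta tag names to Dublin Core predicates.
PREDICATES = {
    "citation_title": "http://purl.org/dc/terms/title",
    "citation_author": "http://purl.org/dc/terms/creator",
    "citation_publication_date": "http://purl.org/dc/terms/issued",
}


class MetaTagParser(HTMLParser):
    """Collects (name, content) pairs from <meta> tags listed in PREDICATES."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name in PREDICATES and content:
            self.fields.append((name, content.strip()))


def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def page_to_triples(html, subject_uri):
    """Scrape metadata from an HTML string and return N-Triples lines."""
    parser = MetaTagParser()
    parser.feed(html)
    return [
        f'<{subject_uri}> <{PREDICATES[name]}> "{escape_literal(value)}" .'
        for name, value in parser.fields
    ]


# Hypothetical page fragment; a real portal's markup may differ.
SAMPLE_HTML = """
<html><head>
<meta name="citation_title" content="An Example Paper">
<meta name="citation_author" content="Doe, Jane">
<meta name="citation_publication_date" content="2020">
</head></html>
"""

if __name__ == "__main__":
    for triple in page_to_triples(SAMPLE_HTML, "http://example.org/paper/1"):
        print(triple)
```

Each scraped field becomes one subject-predicate-object statement, so records from different portals can be merged and queried under a shared vocabulary. A real pipeline would also need to fetch the pages, respect each portal's terms of use, and handle markup that does not follow this convention.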