Representing unstructured data in a structured form is essential for information systems that need to analyze and interpret it. To this end, unstructured data can be converted into knowledge graphs by an information extraction pipeline whose main tasks are named entity recognition and relation extraction. This thesis aims to develop a novel continual relation extraction method that identifies relations (interconnections) between entities in a real-world data stream. The domain-specific data used in this thesis is coronavirus news from German and Austrian newspapers.