Semantic analysis based on knowledge graphs requires a relevant subgraph of a reasonable size. Existing approaches have three issues that impede the integration of such subgraphs. First, there is no off-the-shelf framework for semantically relevant subgraph retrieval. Second, existing approaches are tied to a specific knowledge graph, so even recent studies rely on outdated knowledge graphs. Third, existing approaches are flawed in either entity linking or path expansion, which often results in huge subgraphs. In this paper, we present SRTK, a user-friendly toolkit for semantically relevant subgraph retrieval from large-scale knowledge graphs. SRTK is the first toolkit that streamlines the entire lifecycle of subgraph retrieval, from development (preprocessing, training, and evaluation) to application (entity linking, retrieval, and visualization). Moreover, it supports Wikidata, Freebase, and DBpedia by defining unified access interfaces across different knowledge graphs. Additionally, it ships with a state-of-the-art subgraph retrieval algorithm out of the box. We evaluate the toolkit on Wikidata and Freebase and demonstrate its ability to retrieve semantically relevant subgraphs for a given natural language query.