Virtual Knowledge Graphs (VKGs) are one of the most promising paradigms for integrating and accessing legacy data sources. A critical bottleneck in the integration process is the definition, validation, and maintenance of the mappings that link data sources to a domain ontology. To support the management of mappings throughout their entire lifecycle, we propose a comprehensive catalog of sophisticated mapping patterns that emerge when linking databases to ontologies. To build it, we draw on well-established methodologies and patterns studied in data management, data analysis, and conceptual modeling. We extend and refine these through the analysis of concrete VKG benchmarks and real-world use cases, taking into account the inherent impedance mismatch between data sources and ontologies. We validate our catalog on the considered VKG scenarios, showing that it covers the vast majority of patterns present therein.