Graphs are expressive abstractions that capture relationships in data effectively and support data science tasks. They are also a widely adopted paradigm in causal inference, which focuses on causal directed acyclic graphs (DAGs). Causal DAGs are manually curated by domain experts, but they are never validated, stored, or integrated as data artifacts in a graph data management system. In this paper, we present our vision of aligning these two paradigms, namely causal analysis and property graphs, the latter being the cornerstone of modern graph databases. Realizing this vision requires a paradigm shift: rethinking property graph data models to include hypernodes and structural equations, revisiting graph query semantics and query constructs, and defining graph views that account for causality operators. Several research problems and challenges also arise: automatically extracting causal models from the underlying observational graph data, aligning and integrating disparate causal graph models into unified ones, and maintaining those models as the underlying data changes. This vision would make graph databases aware of causal knowledge and pave the way for data-driven, personalized decision-making in several scientific fields.