Estimating the structure of directed acyclic graphs (DAGs) over features (variables) plays a vital role in revealing the latent data-generating process and providing causal insights in various applications. Although structure learning has been studied for many types of data, it has not yet been explored on dynamic graphs; we therefore study the problem of learning the node feature generation mechanism on such ubiquitous dynamic graph data. For a dynamic graph, we propose to simultaneously estimate the contemporaneous relationships and the time-lagged interaction relationships between node features. Together, these two kinds of relationships form a DAG that concisely characterizes the feature generation process. To learn such a DAG, we cast the learning problem as a continuous score-based optimization problem consisting of a differentiable score function, which measures the validity of the learned DAG, and a smooth acyclicity constraint, which ensures that the learned graph is acyclic. These two components are combined into an unconstrained augmented Lagrangian objective that can be minimized with mature continuous optimization techniques. The resulting algorithm, named GraphNOTEARS, outperforms baselines on simulated data across a wide range of settings that may be encountered in real-world applications. We also apply the proposed approach to two dynamic graphs constructed from the real-world Yelp dataset, demonstrating that our method learns connections between node features that conform with domain knowledge.
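The objective described in the abstract (a differentiable score plus a smooth acyclicity term, folded into an augmented Lagrangian) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's exact formulation: the acyclicity function is the NOTEARS-style h(W) = tr(exp(W ∘ W)) − d, evaluated through a truncated power series to keep the sketch NumPy-only; the score is a least-squares loss for a hypothetical one-lag linear model X_t ≈ X_t W + A_hat X_prev P; and the names `W`, `P`, `A_hat`, `rho`, `alpha`, and `lam` are illustrative.

```python
import numpy as np


def acyclicity(W):
    """NOTEARS-style penalty h(W) = tr(exp(W * W)) - d via its power series.

    Terms beyond k = d are dropped. Any cycle contains a simple cycle of
    length at most d, so the truncated sum is still zero exactly for DAGs
    and positive otherwise.
    """
    d = W.shape[0]
    A = W * W                # elementwise square: nonnegative edge weights
    term = np.eye(d)         # holds A^k / k!, starting at k = 0
    h = 0.0
    for k in range(1, d + 1):
        term = term @ A / k
        h += np.trace(term)  # tr(A^k) > 0 iff a closed walk of length k exists
    return h


def augmented_lagrangian(W, P, X_t, X_prev, A_hat, rho, alpha, lam=0.0):
    """Unconstrained objective: score + L1 sparsity + (rho / 2) h^2 + alpha h.

    Assumed (hypothetical) model: X_t = X_t @ W + A_hat @ X_prev @ P + noise,
    where W holds contemporaneous edges and P holds time-lagged edges.
    Only W is penalized for cycles, since lagged edges point forward in time.
    """
    n = X_t.shape[0]
    residual = X_t - X_t @ W - A_hat @ X_prev @ P
    score = 0.5 / n * np.sum(residual ** 2)
    sparsity = lam * (np.abs(W).sum() + np.abs(P).sum())
    h = acyclicity(W)
    return score + sparsity + 0.5 * rho * h ** 2 + alpha * h


# Tiny demonstration on random data with 5 nodes and 3 features per node.
rng = np.random.default_rng(0)
X_prev = rng.normal(size=(5, 3))
X_t = rng.normal(size=(5, 3))
A_hat = np.eye(5)                        # placeholder node adjacency
W = np.triu(rng.normal(size=(3, 3)), 1)  # strictly upper triangular => DAG
P = rng.normal(size=(3, 3))
print(acyclicity(W))
print(augmented_lagrangian(W, P, X_t, X_prev, A_hat, rho=1.0, alpha=0.0))
```

In an actual solver, this objective would be minimized over `W` and `P` for fixed `rho` and `alpha` (for example with a quasi-Newton method), after which `alpha` is updated and `rho` is increased until the acyclicity value falls below a tolerance.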