Subgraph matching is a challenging problem with a wide range of applications in database systems, biochemistry, and cognitive science. It involves determining whether a given query graph is present within a larger target graph. Traditional graph-matching algorithms provide exact results but scale poorly to large graph instances because the problem is NP-complete, which limits their practical applicability. In contrast, recent neural network-based approximations offer more scalable solutions but often lack interpretable node correspondences. To address these limitations, this article presents xNeuSM (Explainable Neural Subgraph Matching), which introduces Graph Learnable Multi-hop Attention Networks (GLeMA) that adaptively learn the parameters governing the attention factor decay for each node across hops rather than relying on fixed hyperparameters. We provide a theoretical analysis establishing error bounds for GLeMA's approximation of multi-hop attention as a function of the number of hops. Additionally, we prove that learning distinct attention decay factors for each node leads to a correct approximation of multi-hop attention. Empirical evaluation on real-world datasets shows that xNeuSM improves prediction accuracy by up to 34% compared to approximate baselines and, notably, answers queries at least seven times faster than exact algorithms. The source code of our implementation is available at https://github.com/martinakaduc/xNeuSM.
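To make the idea of a per-node attention decay more concrete, the following is a minimal sketch and not the xNeuSM implementation. It assumes a personalized-PageRank-style recurrence, Z[k+1]_i = (1 - alpha_i) * (A Z[k])_i + alpha_i * H_i, which the abstract does not spell out. The function name `multi_hop_attention`, the matrix `A`, the features `H`, and the values in `alpha` are all hypothetical; in a trained model the per-node factors would be learned rather than fixed by hand.

```python
from typing import List

Matrix = List[List[float]]


def multi_hop_attention(attention: Matrix, features: Matrix,
                        alpha: List[float], hops: int) -> Matrix:
    """Approximate multi-hop attention diffusion with a per-node decay factor.

    Assumed recurrence (illustrative, not taken from the paper):
        Z[0]     = H
        Z[k+1]_i = (1 - alpha_i) * (A @ Z[k])_i + alpha_i * H_i
    `attention` is a row-stochastic one-hop attention matrix A,
    `features` is the node feature matrix H, and `alpha[i]` in (0, 1]
    controls how strongly node i is pulled back to its own features.
    """
    n = len(features)
    dim = len(features[0])
    z = [row[:] for row in features]  # Z[0] = H (copied, inputs are not mutated)
    for _ in range(hops):
        nxt = []
        for i in range(n):
            # One-hop aggregation for node i: (A @ Z[k])_i
            agg = [sum(attention[i][j] * z[j][d] for j in range(n))
                   for d in range(dim)]
            # Blend the aggregate with the node's own input features.
            nxt.append([(1.0 - alpha[i]) * agg[d] + alpha[i] * features[i][d]
                        for d in range(dim)])
        z = nxt
    return z


# Hypothetical 3-node example with hand-picked decay factors.
A = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
H = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
alpha = [0.3, 0.5, 0.8]

Z = multi_hop_attention(A, H, alpha, hops=5)
```

In this sketch, with a row-stochastic `A`, the gap between successive iterates shrinks by at least a factor of max_i (1 - alpha_i) per hop, so a handful of hops already lands close to the fixed point. This mirrors, in a simplified setting, the abstract's statement that the approximation error is bounded as a function of the number of hops.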