Graph pattern matching is a fundamental problem underlying many common graph mining tasks and a basic building block of several graph mining systems. This paper explores, for the first time, how to proactively prune graphs to speed up graph pattern matching by leveraging the structure of both the query pattern and the input graph. We propose building auxiliary graphs, which are different pruned versions of the input graph, during query execution. Doing so requires carefully balancing the upfront cost of building and managing auxiliary graphs against the gains from faster set operations. To this end, we propose GraphMini, a new system that uses query compilation and a new cost model to minimize the cost of building and maintaining auxiliary graphs while maximizing these gains. Our evaluation shows that GraphMini can achieve a speedup of one order of magnitude over state-of-the-art subgraph enumeration systems on commonly used benchmarks.
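To make the idea of an auxiliary graph concrete, the following is a minimal Python sketch, not GraphMini's implementation. It counts 4-cliques twice: once with set intersections over the full adjacency sets, and once after pruning each candidate's adjacency set down to the neighborhood of the first matched vertex. The function names, the low-to-high edge orientation, and the choice of the 4-clique pattern are all assumptions made for illustration. The sketch also builds the auxiliary graph unconditionally, whereas the abstract describes a cost model that decides when doing so pays off.

```python
from itertools import combinations


def orient(edges):
    """Build out-neighbor sets pointing from the lower to the higher vertex id.

    Orienting edges this way means each clique is enumerated exactly once.
    """
    out = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = min(u, v), max(u, v)
        out.setdefault(lo, set()).add(hi)
        out.setdefault(hi, set())
    return out


def count_4cliques_baseline(out):
    """Count 4-cliques by intersecting full adjacency sets at every level."""
    total = 0
    for v0 in out:
        for v1 in out[v0]:
            c2 = out[v0] & out[v1]          # candidates for v2
            for v2 in c2:
                total += len(c2 & out[v2])  # candidates for v3
    return total


def count_4cliques_pruned(out):
    """Count 4-cliques using a per-v0 auxiliary (pruned) graph.

    Illustrative assumption: for each v0 we always build the auxiliary graph
    aux[u] = out[u] & out[v0]; a real system would decide this with a cost model.
    """
    total = 0
    for v0 in out:
        cand = out[v0]
        # Auxiliary graph: adjacency sets restricted to neighbors of v0.
        aux = {u: out[u] & cand for u in cand}
        for v1 in cand:
            for v2 in aux[v1]:
                # Both operands are already pruned, so the intersection
                # touches fewer elements than in the baseline.
                total += len(aux[v1] & aux[v2])
    return total


def complete_graph_edges(n):
    """Edges of the complete graph on vertices 0..n-1 (test helper)."""
    return list(combinations(range(n), 2))
```

Both functions return the same count; the pruned variant pays an upfront cost to build `aux` for each `v0` and then reuses the smaller sets in the inner loops, which is the trade-off the abstract refers to.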