Querying cohesive subgraphs on temporal graphs under various time constraints has recently attracted intensive research interest. In this paper, we study a novel Temporal k-Core Query (TCQ) problem: given a time interval, find all distinct k-cores that exist within any of its subintervals in a temporal graph. This problem generalizes the previous historical k-core query. It is challenging because the number of subintervals grows quadratically with the span of the time interval. To address this, we propose a novel Temporal Core Decomposition (TCD) algorithm that decrementally induces temporal k-cores from previously induced ones and thus significantly reduces "intra-core" redundant computation. We then introduce an intuitive concept named Tightest Time Interval (TTI) for temporal k-cores, and design an optimization technique with a theoretical guarantee that uses the TTI as a key to predict which subintervals will induce duplicate k-cores and prunes those subintervals in advance, thereby eliminating "inter-core" redundant computation. The complexity of the optimized TCD (OTCD) algorithm no longer depends on the span of the query time interval but only on the scale of the final results, which makes the OTCD algorithm scalable. Moreover, we propose a compact in-memory data structure named Temporal Edge List (TEL) to implement the OTCD algorithm efficiently at the physical level with a bounded memory requirement. TEL organizes temporal edges along a "timeline" and can be updated instantly when new edges arrive, so our approach can also handle dynamic temporal graphs. We compare the OTCD algorithm with the incremental historical k-core query on several real-world temporal graphs and observe that the OTCD algorithm outperforms it by three orders of magnitude, even though the OTCD algorithm needs no precomputed index.
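To make the problem statement concrete, the following is a minimal brute-force sketch of a TCQ baseline in Python. It is not the TCD or OTCD algorithm and does not use TEL: it simply enumerates every subinterval, which is exactly the quadratic cost the abstract describes. The function names (`k_core_edges`, `brute_force_tcq`), the `(u, v, t)` edge representation, and the choice to count a vertex's degree as its number of distinct neighbors are illustrative assumptions. The TTI of a core is taken here to be the span from the minimum to the maximum timestamp of its edges.

```python
from collections import defaultdict


def k_core_edges(edges, k):
    """Return the temporal edges (u, v, t) that survive k-core peeling.

    Assumption: degree counts distinct neighbors, so parallel edges
    with different timestamps contribute only once.
    """
    nbrs = defaultdict(set)
    for u, v, _ in edges:
        if u != v:  # ignore self-loops
            nbrs[u].add(v)
            nbrs[v].add(u)

    # Repeatedly peel vertices whose degree is below k.
    stack = [v for v in nbrs if len(nbrs[v]) < k]
    removed = set(stack)
    while stack:
        v = stack.pop()
        for w in nbrs[v]:
            if w in removed:
                continue
            nbrs[w].discard(v)
            if len(nbrs[w]) < k:
                removed.add(w)
                stack.append(w)

    return frozenset(
        (u, v, t)
        for u, v, t in edges
        if u != v and u not in removed and v not in removed
    )


def brute_force_tcq(edges, k, ts, te):
    """Map each distinct non-empty k-core (a frozenset of temporal edges)
    found in any subinterval of [ts, te] to its tightest time interval.

    Only subintervals whose endpoints are actual edge timestamps are tried;
    any other subinterval projects to the same edge set as one of these.
    """
    stamps = sorted({t for _, _, t in edges if ts <= t <= te})
    results = {}
    for i, a in enumerate(stamps):
        for b in stamps[i:]:
            window = [e for e in edges if a <= e[2] <= b]
            core = k_core_edges(window, k)
            if core:
                times = [t for _, _, t in core]
                results[core] = (min(times), max(times))
    return results


# Hypothetical toy graph: two triangles sharing the edge a-c.
demo_edges = [
    ("a", "b", 1), ("b", "c", 2), ("a", "c", 3),
    ("c", "d", 4), ("a", "d", 5),
]
demo_result = brute_force_tcq(demo_edges, k=2, ts=1, te=5)
print(sorted(demo_result.values()))  # [(1, 3), (1, 5), (3, 5)]
```

On this toy graph, 15 subintervals are examined but only three distinct 2-cores exist. The subintervals [1, 3] and [1, 4] both induce the same core, as do [2, 5] and [3, 5]. This duplication is the kind of "inter-core" redundancy that the TTI-based pruning is designed to avoid.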