Query-Centered Temporal Community Search via Time-Constrained Personalized PageRank

Existing temporal community search methods suffer from two defects: (i) they ignore the temporal proximity between the query vertex $q$ and other vertices, and simply require the result to include $q$. As a result, to satisfy their cohesiveness constraints they return many vertices that are temporally irrelevant to $q$ (called \emph{query-drifted vertices}), which marginalizes $q$; (ii) their formulations are NP-hard, so exact solutions incur high costs while approximate or heuristic algorithms compromise on quality. Motivated by these defects, we propose a novel problem named \emph{query-centered} temporal community search to circumvent \emph{query-drifted vertices}. Specifically, we first present a novel concept, Time-Constrained Personalized PageRank, to characterize the temporal proximity between $q$ and other vertices. Then, we introduce a model called the $\beta$-temporal proximity core, which combines temporal proximity and structural cohesiveness. Our problem is then formulated as an optimization task that finds a $\beta$-temporal proximity core with the largest $\beta$. To solve it, we first devise an exact, near-linear-time greedy removal algorithm that iteratively removes unpromising vertices. To improve efficiency, we then design an approximate two-stage local search algorithm with bound-based pruning techniques. Finally, extensive experiments on eight real-life datasets against nine competitors demonstrate the superiority of the proposed solutions.
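
The abstract does not give the definition of Time-Constrained Personalized PageRank, so the sketch below is only a stand-in for intuition: ordinary Personalized PageRank from $q$, computed by power iteration over the edges whose timestamps fall inside a window. The function name `personalized_pagerank`, the window parameters `t_start` and `t_end`, the restart probability `alpha`, and the toy edge list are all assumptions made for illustration, not the paper's model.

```python
from collections import defaultdict


def personalized_pagerank(edges, q, alpha=0.15, t_start=None, t_end=None, iters=100):
    """Power-iteration Personalized PageRank from q on undirected temporal edges.

    edges: iterable of (u, v, t) triples.
    Assumption (not from the paper): an edge is usable only if its timestamp t
    lies in [t_start, t_end]; a bound of None means "unbounded on that side".
    """
    adj = defaultdict(list)
    for u, v, t in edges:
        if (t_start is None or t >= t_start) and (t_end is None or t <= t_end):
            adj[u].append(v)
            adj[v].append(u)

    nodes = set(adj) | {q}
    score = {v: 0.0 for v in nodes}
    score[q] = 1.0  # all probability mass starts at the query vertex

    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            neighbors = adj.get(u, [])
            if not neighbors:
                nxt[q] += score[u]  # isolated vertex: its mass returns to q
                continue
            nxt[q] += alpha * score[u]  # restart at q with probability alpha
            share = (1.0 - alpha) * score[u] / len(neighbors)
            for v in neighbors:
                nxt[v] += share  # otherwise move to a uniformly random neighbor
        score = nxt
    return score


# Toy temporal graph: the edge (b, c) at time 9 lies outside the window [.., 5].
edges = [("q", "a", 1), ("a", "b", 2), ("b", "c", 9)]
scores = personalized_pagerank(edges, "q", t_end=5)
```

In this toy run, vertex `c` is reachable from `q` only through an edge outside the time window, so it receives no score at all. That is the flavor of "temporal proximity" the abstract appeals to, although the paper's actual measure may weight time quite differently.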

