Dense subgraph discovery is a fundamental primitive in graph and hypergraph analysis that, among other applications, has been used for real-time story detection on social media and for improving access to data stores of social networking systems. We present several contributions for localized densest subgraph discovery, which seeks dense subgraphs located near a given seed set of nodes. We first introduce a generalization of the recent $\textit{anchored densest subgraph}$ problem, extending this objective to hypergraphs and adding a tunable locality parameter that controls the extent to which the output set overlaps with the seed nodes. Our primary technical contribution is to prove when it is possible to obtain a strongly-local algorithm for solving this problem, meaning that the runtime depends only on the size of the input set. We provide a strongly-local algorithm that applies whenever the locality parameter is at least 1, and show via counterexample why strongly-local algorithms are impossible below this threshold. Along the way to proving our results for localized densest subgraph discovery, we also provide several advances in solving global dense subgraph discovery objectives. These include the first strongly polynomial time algorithm for the densest supermodular set problem and a flow-based exact algorithm for a densest subgraph discovery problem in graphs with arbitrary node weights. We demonstrate the utility of our algorithms on several web-based data analysis tasks.
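For readers unfamiliar with the global objective that the localized variants build on, the following is a minimal sketch of the classical densest subgraph problem, which maximizes $|E(S)|/|S|$ over node sets $S$. It is an illustration only: it uses exhaustive search on a toy graph, not the flow-based or strongly-local algorithms described above, and it does not model seed sets, hypergraphs, node weights, or the locality parameter. The function names `density` and `densest_subgraph_bruteforce` and the example graph are assumptions made for this sketch.

```python
from itertools import combinations


def density(edges, nodes):
    """Return |E(S)| / |S|, where S = nodes and E(S) are edges with both ends in S."""
    s = set(nodes)
    inside = sum(1 for u, v in edges if u in s and v in s)
    return inside / len(s)


def densest_subgraph_bruteforce(edges):
    """Exhaustively search all non-empty node subsets (exponential; toy graphs only)."""
    vertices = sorted({x for e in edges for x in e})
    best_set, best_density = None, -1.0
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            d = density(edges, subset)
            if d > best_density:
                best_set, best_density = set(subset), d
    return best_set, best_density


# Hypothetical toy graph: a 4-clique on {0, 1, 2, 3} with a path 3-4-5 attached.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
best_set, best_density = densest_subgraph_bruteforce(edges)
print(best_set, best_density)
```

On this toy graph the clique has density 6/4, while absorbing the attached path only lowers the ratio, so the search returns the clique. A localized formulation would additionally bias the objective toward a seed set, which this sketch deliberately leaves out.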