Dense subgraph discovery is a fundamental primitive in graph and hypergraph analysis that, among other applications, has been used for real-time story detection on social media and for improving access to data stores of social networking systems. We present several contributions for localized densest subgraph discovery, which seeks dense subgraphs located near a given seed set of nodes. We first introduce a generalization of the recent $\textit{anchored densest subgraph}$ problem, extending this objective to hypergraphs and adding a tunable locality parameter that controls the extent to which the output set overlaps with the seed nodes. Our primary technical contribution is to characterize when it is possible to obtain a strongly-local algorithm for this problem, meaning one whose runtime depends only on the size of the input set. We provide a strongly-local algorithm that applies whenever the locality parameter is not too small, and show via counterexample that strongly-local algorithms are impossible below a certain threshold. Along the way to proving our results for localized densest subgraph discovery, we also provide several advances in solving global dense subgraph discovery objectives. These include the first strongly polynomial time algorithm for the densest supermodular set problem and a flow-based exact algorithm for a heavy and dense subgraph discovery problem in graphs with arbitrary node weights. We demonstrate our algorithms on several web-based data analysis tasks.
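To make the notion of a dense subgraph concrete, the sketch below illustrates the classical global densest subgraph objective on a simple undirected graph, where the density of a node set $S$ is $|E(S)|/|S|$. It uses the well-known greedy peeling heuristic, which is a 1/2-approximation for this objective. It is not the anchored or hypergraph objective studied here, nor any of the strongly-local or flow-based algorithms we propose. The function names `density` and `greedy_peel` and the toy edge list are assumptions made purely for illustration.

```python
from collections import defaultdict


def density(edges, nodes):
    """Return |E(S)| / |S| for the subgraph induced by `nodes` (0.0 if empty)."""
    nodes = set(nodes)
    if not nodes:
        return 0.0
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    return inside / len(nodes)


def greedy_peel(edges):
    """Greedy peeling for the classical densest subgraph objective.

    Assumes a simple undirected graph (no self-loops) given as a list of
    (u, v) pairs with mutually comparable node labels. Repeatedly removes a
    minimum-degree node and returns the densest node set seen along the way.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    remaining = set(adj)
    if not remaining:
        return set(), 0.0

    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_set = set(remaining)
    best_density = num_edges / len(remaining)

    while len(remaining) > 1:
        # Pick a minimum-degree node; ties are broken by node label.
        u = min(remaining, key=lambda x: (len(adj[x]), x))
        num_edges -= len(adj[u])
        for w in adj[u]:
            adj[w].discard(u)
        del adj[u]
        remaining.remove(u)

        current = num_edges / len(remaining)
        if current > best_density:
            best_set, best_density = set(remaining), current

    return best_set, best_density


# Toy graph (illustrative only): a 4-clique on {0, 1, 2, 3} with a path 3-4-5 attached.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
best_nodes, best_value = greedy_peel(toy_edges)
```

On this toy graph the peeling order strips the attached path first and recovers the 4-clique, whose density is 6/4. A localized variant would additionally bias the objective toward a seed set, which this global sketch does not attempt.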