We are interested in approximating the maximum flow in large (brain) connectivity networks, a quantity that helps to better understand the routing of information in the human brain. However, the classic Edmonds-Karp algorithm has runtime $O(|V||E|^2)$, where $V$ is the set of vertices and $E$ is the set of edges, which renders exact computation of the maximum flow infeasible on networks with millions of vertices. In this contribution, we propose a new Monte Carlo algorithm that approximates the maximum flow in such networks via subsampling. Apart from giving a point estimate of the maximum flow, our algorithm also returns valid confidence bounds for the true maximum flow. Importantly, its runtime scales only as $O(B \cdot |\tilde{V}| |\tilde{E}|^2)$, where $B$ is the number of Monte Carlo samples, $\tilde{V}$ is the set of subsampled vertices, and $\tilde{E}$ is the edge set induced by $\tilde{V}$. Choosing $B \in O(|V|)$ and $|\tilde{V}| \in O(\sqrt{|V|})$ (implying $|\tilde{E}| \in O(|V|)$) yields an algorithm with runtime $O(|V|^{3.5})$ while still guaranteeing the usual "root-n" convergence of the confidence interval of the maximum flow estimate. We evaluate the proposed algorithm with respect to both accuracy and runtime on simulated graphs as well as graphs downloaded from the Brain Networks Data Repository (https://networkrepository.com).
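To make the subsampling idea concrete, the following is a minimal Python sketch, not the algorithm proposed here. It draws $B$ random vertex subsets, runs Edmonds-Karp on each induced subgraph, and reports the mean subsample flow with a normal-approximation interval for that mean. Several parts are assumptions made only for illustration: the function names (`edmonds_karp`, `induced_subgraph`, `subsampled_max_flow`), the dict-of-dicts graph representation, the choice to always keep the source and sink in every subsample, and the interval construction. In particular, the sketch does not reproduce how the subsample flows are mapped to an estimate and valid confidence bounds for the maximum flow of the full network.

```python
import math
import random
from collections import deque


def edmonds_karp(capacity, source, sink):
    """Maximum flow from source to sink (distinct nodes) via BFS augmenting paths.

    capacity is a dict of dicts: capacity[u][v] is the capacity of edge u -> v.
    """
    # Residual graph: copy the capacities and add reverse edges with capacity 0.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0.0)
    residual.setdefault(source, {})
    residual.setdefault(sink, {})

    flow = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left

        # Bottleneck capacity along the path found.
        bottleneck = math.inf
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck along the path and update the residual graph.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def induced_subgraph(capacity, keep):
    """Keep only the edges whose two endpoints both lie in the set keep."""
    return {
        u: {v: c for v, c in nbrs.items() if v in keep}
        for u, nbrs in capacity.items()
        if u in keep
    }


def subsampled_max_flow(capacity, source, sink, n_sub, n_rep, seed=0, z=1.96):
    """Mean max flow over n_rep random induced subgraphs, with a normal interval.

    Assumptions (illustrative only): source and sink are kept in every
    subsample, n_sub other vertices are drawn without replacement,
    n_rep >= 2, and node labels are sortable.
    """
    rng = random.Random(seed)
    nodes = set(capacity)
    for nbrs in capacity.values():
        nodes.update(nbrs)
    others = sorted(nodes - {source, sink})

    flows = []
    for _ in range(n_rep):
        keep = set(rng.sample(others, n_sub)) | {source, sink}
        flows.append(edmonds_karp(induced_subgraph(capacity, keep), source, sink))

    mean = sum(flows) / n_rep
    var = sum((f - mean) ** 2 for f in flows) / (n_rep - 1)
    half_width = z * math.sqrt(var / n_rep)
    return mean, (mean - half_width, mean + half_width)
```

A call such as `subsampled_max_flow(graph, s, t, n_sub=math.isqrt(n), n_rep=n)`, with `n` the number of vertices, mirrors the choice $|\tilde{V}| \in O(\sqrt{|V|})$ and $B \in O(|V|)$ discussed above. Each replicate then costs one Edmonds-Karp run on a graph with roughly $\sqrt{|V|}$ vertices.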