In applications of group testing in networks, e.g. identifying individuals infected by a disease that spreads over a network, exploiting correlation among network nodes offers a fundamental opportunity to reduce the number of tests needed. We model and analyze group testing on $n$ correlated nodes whose interactions are specified by a graph $G$. We model correlation through an edge-faulty random graph formed from $G$, in which each edge is dropped with probability $1-r$ and all nodes in the same component have the same state. We consider three classes of graphs: cycles and trees, $d$-regular graphs, and stochastic block models (SBMs), and obtain lower and upper bounds on the number of tests needed to identify the defective nodes. Our results are expressed in terms of the number of tests needed when the nodes are independent, as functions of $n$, $r$, and the target error. In particular, we quantify the fundamental improvement that exploiting correlation offers by the ratio between the total number of nodes $n$ and the equivalent number of independent nodes in a classic group testing algorithm. The lower bounds are derived by showing that the number of tests needed depends strongly on the expected number of components. In this regard, we establish a new approximation for the distribution of component sizes in "$d$-regular trees", which may be of independent interest and leads to a lower bound on the expected number of components in $d$-regular graphs. The upper bounds are found by forming dense subgraphs in which nodes are more likely to be in the same state. When $G$ is a cycle or a tree, we show an improvement by a factor of $\log(1/r)$. For a grid, a graph with almost $2n$ edges, the improvement is by a factor of $(1-r)\log(1/r)$, indicating a drastic improvement compared to trees. When $G$ has a larger number of edges, as in SBMs, the improvement can scale with $n$.
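The correlation model described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: it keeps each edge of $G$ with probability $r$, finds the resulting components with a union-find structure, and gives every node in a component the same state. The function names and the per-component defect probability `p` are assumptions introduced for illustration only; the abstract does not say how component states are drawn.

```python
import random


def sample_components(n, edges, r, rng):
    """Keep each edge with probability r (drop it with probability 1 - r)
    and return one component label per node."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if rng.random() < r:  # this edge survives
            parent[find(u)] = find(v)
    return [find(x) for x in range(n)]


def sample_states(labels, p, rng):
    """Assumption: each component is defective independently with
    probability p; all nodes in a component share that state."""
    state_of = {root: rng.random() < p for root in set(labels)}
    return [state_of[root] for root in labels]


def cycle_edges(n):
    """Edges of a cycle on nodes 0..n-1."""
    return [(i, (i + 1) % n) for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n, r, p = 1000, 0.9, 0.05  # illustrative values only
    labels = sample_components(n, cycle_edges(n), r, rng)
    states = sample_states(labels, p, rng)
    print("components:", len(set(labels)), "defective nodes:", sum(states))
```

On a cycle, the number of components equals the number of dropped edges whenever at least one edge is dropped, so the component count in this sketch concentrates around $n(1-r)$ for large $n$. Swapping `cycle_edges` for another edge list lets the same code sample trees, grids, or other graphs.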