This work initiates the study of memory-query tradeoffs for graph problems, with a focus on correlation clustering. Correlation clustering asks for a partition of the vertices that minimizes disagreements: non-edges inside clusters plus edges across clusters. Our first result is a tight query lower bound: to output a partition whose cost approximates the optimum up to an additive error of $\varepsilon n^2$, any algorithm requires $\Omega(n/\varepsilon^2)$ adjacency-matrix queries. Under memory constraints, we show that even for the seemingly easier task of approximating the optimal clustering cost (without producing a partition), any algorithm in the random query model must make $\gg n/\varepsilon^2$ adjacency-matrix queries. Finally, we prove the first general graph model query lower bound for correlation clustering, where algorithms are allowed adjacency-matrix, neighbor, and degree queries. The latter two bounds are not yet tight, leaving room for sharper results.
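To make the disagreement objective concrete, the following is a minimal sketch that counts non-edges inside clusters plus edges across clusters for a given partition. The function name `disagreements`, its parameters, and the toy graph are illustrative assumptions and do not come from the paper. The sketch inspects all $\binom{n}{2}$ vertex pairs, so it is a full-information baseline and not a query-efficient algorithm of the kind the bounds address.

```python
from itertools import combinations


def disagreements(n, edges, cluster_of):
    """Count correlation-clustering disagreements of a partition.

    n          -- number of vertices, labelled 0..n-1 (assumed labelling)
    edges      -- iterable of undirected edges (u, v)
    cluster_of -- list mapping each vertex to its cluster label
    """
    edge_set = {frozenset(e) for e in edges}
    cost = 0
    for u, v in combinations(range(n), 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        is_edge = frozenset((u, v)) in edge_set
        # A pair disagrees if it is a non-edge inside a cluster
        # or an edge across clusters.
        if same_cluster != is_edge:
            cost += 1
    return cost


# Toy graph: a triangle on {0, 1, 2} plus vertex 3 attached to vertex 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(disagreements(4, edges, [0, 0, 0, 1]))  # only edge (2, 3) is cut
print(disagreements(4, edges, [0, 0, 0, 0]))  # non-edges (0, 3) and (1, 3) share a cluster
```

On this toy graph, separating vertex 3 from the triangle costs one disagreement, while a single cluster costs two.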