We introduce EdgeSketch, a compact graph representation for efficient analysis of massive graph streams. EdgeSketch provides unbiased estimators of key graph properties with controllable variance and allows graph algorithms to run directly on the stored summary. It is constructed in a fully streaming manner, requiring a single pass over the edge stream, while offline analysis relies solely on the sketch. We evaluate the proposed approach on two representative applications: community detection via the Louvain method and graph reconstruction through node similarity estimation. Experiments demonstrate substantial memory savings and runtime improvements over both lossless representations and prior sketching approaches, while maintaining reliable accuracy.