Graphs play an increasingly important role in big data applications. However, existing graph data structures cannot simultaneously address the performance bottlenecks caused by the dynamic updates, large scale, and high query complexity of current graphs. This paper proposes CuckooGraph, a novel data structure for large-scale dynamic graphs. It requires no prior knowledge of the incoming graph and adaptively resizes to its most memory-efficient form, while needing only a few memory accesses per operation for very fast graph data processing. The key techniques of CuckooGraph are TRANSFORMATION and DENYLIST. TRANSFORMATION makes full use of limited memory through data structures that support flexible space transformations, smoothly expanding or contracting the allocated space according to the number of incoming items. DENYLIST efficiently handles item insertion failures and further improves processing speed. Our experimental results show that, compared with Spruce, the most competitive existing solution, CuckooGraph achieves about $33\times$ higher insertion throughput while requiring only about $68\%$ of the memory space.
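To make the two ideas in the abstract more concrete, the following is a minimal Python sketch of a generic cuckoo-hashed edge set. It is not the CuckooGraph design itself: the abstract does not specify how TRANSFORMATION or DENYLIST are implemented, so everything here is an assumption made for illustration. The class name `CuckooEdgeSet`, the constants `MAX_KICKS` and `DENYLIST_CAP`, the two-table layout, and the doubling/halving policy are all hypothetical. The sketch only shows the general pattern of (a) parking items whose insertion fails in a small overflow list instead of rehashing immediately, and (b) growing or shrinking the tables as the number of stored edges changes.

```python
class CuckooEdgeSet:
    """Toy edge set: two-table cuckoo hashing plus a small overflow list.

    Illustrative only; all names and policies are assumptions,
    not the data structure described in the paper.
    """

    MAX_KICKS = 32     # assumed bound on evictions per insertion
    DENYLIST_CAP = 8   # assumed overflow size that triggers growth
    MIN_CAPACITY = 4   # assumed smallest table size

    def __init__(self):
        self.capacity = self.MIN_CAPACITY   # slots per table
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        self.denylist = []                  # edges that could not be placed
        self.size = 0

    def _slot(self, which, edge):
        # Two hash functions, derived by salting Python's built-in hash.
        return hash((which, edge)) % self.capacity

    def contains(self, u, v):
        edge = (u, v)
        return (self.tables[0][self._slot(0, edge)] == edge
                or self.tables[1][self._slot(1, edge)] == edge
                or edge in self.denylist)

    def _place(self, edge):
        # Standard cuckoo displacement: put the edge in its slot and
        # carry the evicted occupant over to the other table.
        which = 0
        for _ in range(self.MAX_KICKS):
            i = self._slot(which, edge)
            edge, self.tables[which][i] = self.tables[which][i], edge
            if edge is None:
                return
            which ^= 1
        # Too many evictions: park the carried edge in the overflow list.
        self.denylist.append(edge)

    def insert(self, u, v):
        if self.contains(u, v):
            return False
        self._place((u, v))
        self.size += 1
        if len(self.denylist) > self.DENYLIST_CAP:
            self._resize(self.capacity * 2)    # expand
        return True

    def delete(self, u, v):
        edge = (u, v)
        for which in (0, 1):
            i = self._slot(which, edge)
            if self.tables[which][i] == edge:
                self.tables[which][i] = None
                break
        else:
            if edge not in self.denylist:
                return False
            self.denylist.remove(edge)
        self.size -= 1
        if self.capacity > self.MIN_CAPACITY and self.size < self.capacity // 2:
            self._resize(self.capacity // 2)   # contract
        return True

    def _resize(self, new_capacity):
        # Rebuild both tables at the new size and re-place every edge.
        edges = [e for t in self.tables for e in t if e is not None]
        edges += self.denylist
        self.capacity = new_capacity
        self.tables = [[None] * new_capacity, [None] * new_capacity]
        self.denylist = []
        for e in edges:
            self._place(e)
```

In this sketch a lookup touches at most two table slots plus the short overflow list, which is the usual reason cuckoo hashing keeps memory accesses low. A real implementation would likely resize incrementally rather than rebuilding everything at once, but that is beyond what the abstract describes.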