SAGE: A Storage-Based Approach for Scalable and Efficient Sparse Generalized Matrix-Matrix Multiplication

Sparse generalized matrix-matrix multiplication (SpGEMM) is a fundamental operation in real-world network analysis. As real-world networks grow, single-machine SpGEMM approaches cannot process networks that exceed the size of main memory (i.e., they are not scalable). Distributed-system-based approaches can handle large-scale SpGEMM across multiple machines, but they suffer from severe inter-machine communication overhead when aggregating the machines' results (i.e., they are not efficient). To address this dilemma, we propose SAGE, a novel storage-based SpGEMM approach that keeps the given networks in storage (e.g., SSD) and, via a 3-layer architecture, loads only the necessary parts into main memory when they are required for processing. We further identify three challenges that could degrade the overall performance of SAGE and propose three strategies to address them: (1) block-based workload allocation, which balances workloads across threads; (2) in-memory partial aggregation, which reduces unnecessary storage-memory I/Os; and (3) distribution-aware memory allocation, which prevents unexpected buffer overflows in main memory. Through extensive evaluation, we verify the superiority of SAGE over existing SpGEMM methods in terms of scalability and efficiency.
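To make the operation concrete, the following is a minimal sketch of row-wise SpGEMM in which each output row is summed in an in-memory accumulator and rows are grouped into blocks of roughly equal estimated work. It is an illustration only and not SAGE's implementation: the abstract does not specify the algorithms, so the dict-of-dicts matrix layout, the greedy balancing heuristic, and all function names (`spgemm_rows`, `row_work`, `balanced_blocks`, `spgemm`) are assumptions made for this example, and storage I/O is not modeled at all.

```python
from collections import defaultdict

# Assumed layout (not from the paper): a sparse matrix is a dict
# mapping row index -> {column index: value}.

def spgemm_rows(A, B, rows):
    """Compute the given rows of C = A * B, one output row at a time."""
    C = {}
    for i in rows:
        acc = defaultdict(float)  # in-memory accumulator for output row i
        for k, a_ik in A.get(i, {}).items():
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj  # partial products are summed in memory
        if acc:
            C[i] = dict(acc)
    return C

def row_work(A, B, i):
    """Estimate the work for row i as the number of scalar multiplications."""
    return sum(len(B.get(k, {})) for k in A.get(i, {}))

def balanced_blocks(A, B, num_blocks):
    """Greedy heuristic (an assumption): heaviest rows first, each row
    assigned to the currently least-loaded block."""
    blocks = [[] for _ in range(num_blocks)]
    loads = [0] * num_blocks
    for i in sorted(A, key=lambda r: row_work(A, B, r), reverse=True):
        target = loads.index(min(loads))
        blocks[target].append(i)
        loads[target] += row_work(A, B, i)
    return blocks, loads

def spgemm(A, B, num_blocks=2):
    """Multiply block by block; each block could be given to one thread."""
    blocks, _ = balanced_blocks(A, B, num_blocks)
    C = {}
    for rows in blocks:
        C.update(spgemm_rows(A, B, rows))
    return C

A = {0: {0: 1.0, 1: 2.0}, 1: {1: 3.0}}
B = {0: {0: 4.0}, 1: {0: 5.0, 1: 6.0}}
C = spgemm(A, B)
```

Here the blocks are processed sequentially for simplicity. Balancing by estimated multiplications rather than by row count reflects the observation that rows of a sparse matrix can differ widely in cost, which is the kind of imbalance a workload allocation strategy has to handle.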

