Graph Neural Networks (GNNs) have emerged as a powerful paradigm for learning on graph-structured data. Unlike traditional neural networks, GNNs capture the complex relationships and dependencies inherent in graphs, which makes them well suited to applications such as social network analysis, molecular chemistry, and network security. Their structure and operation also pose computational challenges that conventional neural networks do not, so comprehensive benchmarking and thorough characterization are needed to understand their computational requirements and to identify performance bottlenecks. In this thesis, we develop a better understanding of how GNNs interact with the underlying hardware, and we leverage this knowledge to design specialized accelerators and new optimizations that make GNN computation faster and more efficient. A pivotal component of GNNs is the Sparse General Matrix-Matrix Multiplication (SpGEMM) kernel, known for its computational intensity and irregular memory access patterns. We address the challenges posed by SpGEMM by implementing a highly optimized hashing-based SpGEMM kernel tailored for a custom accelerator. Synthesizing these insights and optimizations, we design state-of-the-art hardware accelerators capable of efficiently handling a variety of GNN workloads. Our accelerator architectures are grounded in our characterization of GNN computational demands, which provides clear motivation for our design choices. Our exploration of novel models underlines this comprehensive approach: we strive to enable accelerators that are not only performant but also versatile, able to adapt to the evolving landscape of graph computing.