The tensor programming abstraction has become a cornerstone of high-performance computing. This framework allows users to write efficient programs for bulk computation via a high-level imperative interface. Recent work has extended this paradigm to sparse tensors (i.e., tensors where most entries are not explicitly represented) through the use of sparse tensor compilers. These systems excel at producing efficient code for computation over sparse tensors, which may be stored in a wide variety of formats. However, they require the user to manually choose the order of operations and the data format at every step. Unfortunately, these decisions are both highly impactful and complicated, requiring significant effort to optimize by hand. In this work, we present Galley, a system for declarative sparse tensor programming. Galley performs cost-based optimization to lower such programs first to a logical plan and then to a physical plan. It then leverages sparse tensor compilers to execute the physical plan efficiently. We show that Galley achieves high performance on a wide range of problems, including machine learning algorithms, subgraph counting, and iterative graph algorithms.
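To make the format-sensitivity point concrete, the following is a minimal sketch using scipy.sparse (an illustration only, not Galley's interface): the same logical sparse matrix product can be executed under different physical storage formats, yielding identical results but potentially very different performance characteristics.

```python
import numpy as np
import scipy.sparse as sp

# Two sparse matrices stored in different formats (assumed sizes/densities
# are arbitrary, chosen only for illustration).
rng = np.random.default_rng(0)
A = sp.random(100, 100, density=0.05, random_state=rng).tocsr()
B = sp.random(100, 100, density=0.05, random_state=rng).tocsc()

# Two physically different plans for the same logical computation A @ B:
# the user must pick the storage format for each operand by hand.
C_csr = (A @ B.tocsr()).toarray()   # both operands row-major (CSR)
C_csc = (A.tocsc() @ B).toarray()   # both operands column-major (CSC)

# The logical result is identical regardless of the physical plan.
assert np.allclose(C_csr, C_csc)
```

A cost-based optimizer like the one the abstract describes would make such format and ordering choices automatically rather than leaving them to the programmer.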