The rapid growth in the size of deep learning models strains the capabilities of traditional dense computation paradigms. Leveraging sparse computation has become increasingly popular for training and deploying large-scale models, but existing deep learning frameworks lack extensive support for sparse operations. To bridge this gap, we introduce Scorch, a library that seamlessly integrates efficient sparse tensor computation into the PyTorch ecosystem, with an initial focus on inference workloads on CPUs. Scorch provides a flexible and intuitive interface for sparse tensors, supporting diverse sparse data structures. Scorch introduces a compiler stack that automates key optimizations, including automatic loop ordering, tiling, and format inference. Combined with a runtime that adapts its execution to both dense and sparse data, Scorch delivers substantial speedups over hand-written PyTorch Sparse (torch.sparse) operations without sacrificing usability. More importantly, Scorch enables efficient computation of complex sparse operations that lack hand-optimized PyTorch implementations. This flexibility is crucial for exploring novel sparse architectures. We demonstrate Scorch's ease of use and performance gains on diverse deep learning models across multiple domains. With only minimal code changes, Scorch achieves 1.05-5.78x speedups over PyTorch Sparse on end-to-end tasks. Scorch's seamless integration and performance gains make it a valuable addition to the PyTorch ecosystem. We believe Scorch will enable wider exploration of sparsity as a tool for scaling deep learning and inform the development of other sparse libraries.
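To make concrete the kind of computation the abstract refers to, the following is a minimal sketch in plain Python of a sparse-times-dense matrix product using the compressed sparse row (CSR) layout. It is an illustration only: it is neither Scorch's interface nor the code its compiler generates, and the names `dense_to_csr` and `csr_matmul` are hypothetical helpers introduced here. The loop order (rows of A, then stored nonzeros, then columns of B) is fixed by hand in this sketch, whereas the abstract lists loop ordering among the choices Scorch automates.

```python
# Illustrative sketch only: not Scorch's API and not torch.sparse.
# All function and type names below are hypothetical.
from typing import List, Tuple

Dense = List[List[float]]
# CSR layout: (row_ptr, col_idx, values)
CSR = Tuple[List[int], List[int], List[float]]


def dense_to_csr(a: Dense) -> CSR:
    """Compress a dense row-major matrix, storing only its nonzero entries."""
    row_ptr: List[int] = [0]
    col_idx: List[int] = []
    values: List[float] = []
    for row in a:
        for j, v in enumerate(row):
            if v != 0.0:
                col_idx.append(j)
                values.append(float(v))
        # row_ptr[i + 1] marks where row i ends in col_idx/values.
        row_ptr.append(len(values))
    return row_ptr, col_idx, values


def csr_matmul(a: CSR, b: Dense) -> Dense:
    """Compute A @ B with A in CSR form, visiting only A's stored nonzeros."""
    row_ptr, col_idx, values = a
    n_rows = len(row_ptr) - 1
    n_cols = len(b[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):                          # rows of A
        for p in range(row_ptr[i], row_ptr[i + 1]):  # nonzeros in row i
            k, v = col_idx[p], values[p]
            for j in range(n_cols):                  # columns of B
                out[i][j] += v * b[k][j]
    return out


A = [[0.0, 2.0, 0.0],
     [0.0, 0.0, 0.0],
     [1.0, 0.0, 3.0]]
B = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]

A_csr = dense_to_csr(A)
C = csr_matmul(A_csr, B)
```

Here the inner work is proportional to the three stored nonzeros of `A` rather than to all nine of its entries, which is the saving that sparse formats aim for. Choosing a different format or loop order changes which operand is traversed sparsely, and that is the kind of decision the abstract says Scorch's compiler stack makes automatically.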