CP decomposition is a powerful tool for data science, especially in gene analysis, deep learning, and quantum computation. However, the application of tensor decomposition is largely hindered by the exponential growth of computational complexity and storage consumption with tensor size. While real-world data is often presented as trillion-scale or even exascale tensors, existing work supports only billion-scale tensors. In this work, we propose Exascale-Tensor to close this gap. Specifically, Exascale-Tensor is a compression-based tensor decomposition framework that supports exascale tensor decomposition. We then carefully analyze its inherent parallelism and propose a set of strategies to improve computational efficiency. Finally, we evaluate the framework by decomposing tensors ranging from million-scale to trillion-scale. Compared to the baselines, Exascale-Tensor supports 8,000x larger tensors and achieves a speedup of up to 6.95x. We also apply our method to two real-world applications, gene analysis and tensor-layer neural networks, where the numerical results demonstrate the scalability and effectiveness of our method.
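To make the idea of compression-based CP decomposition more concrete, the following minimal sketch illustrates one property that such approaches can rely on: multiplying every mode of a rank-R CP tensor by a short, wide matrix yields a much smaller tensor that is again rank-R, with factors equal to the compressed originals. This is not the Exascale-Tensor algorithm itself. The helper names `cp_to_tensor` and `compress`, the tensor sizes, the rank, and the use of random Gaussian compression matrices are all assumptions chosen for illustration.

```python
import numpy as np


def cp_to_tensor(A, B, C):
    """Build a 3-way tensor from CP factors: X[i,j,k] = sum_r A[i,r] B[j,r] C[k,r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def compress(X, U, V, W):
    """Multiply each mode of X by a matrix: Y[a,b,c] = sum_ijk U[a,i] V[b,j] W[c,k] X[i,j,k]."""
    return np.einsum("ai,bj,ck,ijk->abc", U, V, W, X)


rng = np.random.default_rng(0)
I, J, K, R = 30, 40, 50, 3   # illustrative tensor sizes and CP rank (assumed)
L, M, N = 6, 7, 8            # illustrative compressed sizes (assumed)

# Ground-truth CP factors and random compression matrices (hypothetical choice).
A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
U, V, W = (rng.standard_normal(s) for s in ((L, I), (M, J), (N, K)))

X = cp_to_tensor(A, B, C)   # full tensor, I*J*K entries
Y = compress(X, U, V, W)    # compressed tensor, L*M*N entries

# The compressed tensor has CP factors U@A, V@B, W@C, so it can be
# decomposed at the small size instead of the original one.
Y_from_factors = cp_to_tensor(U @ A, V @ B, W @ C)

print("full entries:", X.size, "compressed entries:", Y.size)
print("identity holds:", np.allclose(Y, Y_from_factors))
```

Here the full tensor is materialized only so the identity can be checked. At very large scales a practical method would presumably avoid forming the full tensor in memory and would need an additional step to recover the original factors from the compressed ones; neither is shown in this sketch.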