The tensor-train (TT) format is a data-sparse tensor representation commonly used for high-dimensional function approximation in the computational and data sciences. Various sequential and parallel TT decomposition algorithms have been proposed for different tensor inputs and assumptions. In this paper, we propose subtensor parallel adaptive TT cross, which partitions a tensor onto distributed-memory machines with multidimensional process grids and constructs a TT approximation iteratively from tensor elements. We derive two iterative formulations for pivot selection and TT core construction in the distributed-memory setting, conduct communication and scaling analysis of the algorithm, and illustrate its performance with several test experiments. These include Hilbert tensors of up to six dimensions and tensors constructed from Maxwellian distribution functions that arise in kinetic theory. Our results demonstrate high accuracy with greatly reduced storage requirements via the TT cross approximation. Furthermore, we demonstrate good to optimal strong and weak scaling performance for the proposed parallel algorithm.
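To make the storage claim concrete, the following is a minimal sketch of compressing a small Hilbert tensor into the TT format. It is not the parallel TT cross algorithm proposed here: it uses the standard sequential TT-SVD, which needs the full tensor in memory, whereas a cross method builds the cores from selected tensor elements only. The entry formula 1 / (i_1 + ... + i_d + 1) with 0-based indices is one common definition of the Hilbert tensor and is assumed here, and the function names, tensor size, and tolerance are illustrative choices.

```python
import numpy as np

def hilbert_tensor(n, d):
    """Dense d-dimensional tensor with entries 1 / (i_1 + ... + i_d + 1), 0-based indices (assumed definition)."""
    index_grids = np.indices((n,) * d)            # shape (d, n, ..., n)
    return 1.0 / (index_grids.sum(axis=0) + 1.0)

def tt_svd(tensor, tol=1e-8):
    """Sequential TT-SVD; returns cores of shape (r_prev, n_k, r_next)."""
    dims = tensor.shape
    d = len(dims)
    # Split the relative tolerance evenly over the d-1 truncated SVDs.
    delta = tol * np.linalg.norm(tensor) / np.sqrt(d - 1)
    cores = []
    r_prev = 1
    mat = tensor.reshape(r_prev * dims[0], -1)
    for k in range(d - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        # tail[j] is the Frobenius norm of the singular values s[j:].
        tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]
        # Smallest rank whose discarded singular values have norm <= delta.
        rank = max(1, int(np.sum(tail > delta)))
        cores.append(u[:, :rank].reshape(r_prev, dims[k], rank))
        # Carry the remainder forward and expose the next mode as rows.
        mat = (s[:rank, None] * vt[:rank]).reshape(rank * dims[k + 1], -1)
        r_prev = rank
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract the TT cores back into a dense tensor (for checking only)."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))

# Illustrative sizes: a 4D tensor with 16 points per dimension.
tensor = hilbert_tensor(n=16, d=4)
cores = tt_svd(tensor, tol=1e-8)

tt_ranks = [core.shape[2] for core in cores[:-1]]
n_params = sum(core.size for core in cores)
rel_err = np.linalg.norm(tt_to_full(cores) - tensor) / np.linalg.norm(tensor)

print("TT ranks:", tt_ranks)
print("full entries:", tensor.size, "TT entries:", n_params)
print("relative error:", rel_err)
```

The dense tensor grows as n^d, while the TT cores grow only linearly in d for bounded ranks, which is the source of the storage reduction. The dense reconstruction in `tt_to_full` is included only to measure the error at this small size and would not be feasible for the tensor sizes a distributed algorithm targets.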