Tensor Train~(TT) decomposition is widely used in the machine learning and quantum physics communities as a popular tool to efficiently compress high-dimensional tensor data. In this paper, we propose an efficient algorithm to accelerate the computation of the TT decomposition via Alternating Least Squares (ALS), relying on exact leverage score sampling. For this purpose, we propose a data structure that allows us to sample from the tensor with time complexity logarithmic in the tensor size. Our contribution specifically leverages the canonical form of the TT decomposition. By maintaining the canonical form through each iteration of ALS, we can efficiently compute (and sample from) the leverage scores, thus achieving significant speed-up in solving each sketched least-squares problem. Experiments on synthetic and real data, covering both dense and sparse tensors, demonstrate that our method outperforms SVD-based and ALS-based algorithms.
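The core primitive described above can be illustrated in isolation: given the design matrix of a least-squares problem, sample rows with probability proportional to their exact leverage scores and solve the resulting sketched system. The NumPy sketch below shows this generic step only; the function names are hypothetical, and the paper's actual method obtains the scores from the TT cores in canonical form rather than by factorizing the full design matrix.

```python
import numpy as np

def leverage_scores(A):
    # Leverage score of row i is the squared norm of row i of Q,
    # where A = QR is a thin QR factorization of the design matrix.
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

def sketched_lstsq(A, b, num_samples, rng):
    # Sample rows i.i.d. with probability proportional to leverage scores,
    # rescale rows so the sketched normal equations are unbiased, then solve.
    scores = leverage_scores(A)
    p = scores / scores.sum()
    idx = rng.choice(A.shape[0], size=num_samples, p=p)
    w = 1.0 / np.sqrt(num_samples * p[idx])
    x, *_ = np.linalg.lstsq(w[:, None] * A[idx], w * b[idx], rcond=None)
    return x
```

In an ALS sweep, each core update is a least-squares problem of this shape; the speed-up in the paper comes from never forming `A` explicitly and from updating the leverage-score structure in logarithmic time as the cores change.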