We propose two provably accurate methods for low CP-rank tensor completion: one using adaptive sampling and one using nonadaptive sampling. Both algorithms combine matrix completion techniques, applied to a small number of slices, with Jennrich's algorithm to learn the factors corresponding to the first two modes, and then solve systems of linear equations to learn the factors corresponding to the remaining modes. For order-$3$ tensors, our algorithms follow a "sandwich" sampling strategy that samples a few outer slices more densely (the bread) and then samples additional inner slices more sparsely (the bbq-braised tofu) for the final completion. For an order-$d$, CP-rank $r$ tensor of size $n \times \cdots \times n$ that satisfies mild assumptions, our adaptive sampling algorithm recovers the CP-decomposition with high probability while using at most $O(nr\log r + dnr)$ samples and $O(n^2r^2+dnr^2)$ operations. Our nonadaptive sampling algorithm recovers the CP-decomposition with high probability using at most $O(dnr^2\log n + nr\log^2 n)$ samples, and it runs in polynomial time. Numerical experiments demonstrate that both methods work well on noisy synthetic data as well as on real-world data.
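To make the pipeline concrete, the following is a minimal NumPy sketch of the order-$3$ case under simplifying assumptions that are not taken from the paper: the two "bread" slices are treated as already fully known (the matrix completion step is skipped), the data are noiseless, the rank $r$ is given, and the factors are generic random matrices. The function names `rank_r_pinv`, `jennrich_two_slices`, and `solve_third_mode` are hypothetical, and the second factor is read off by a linear solve rather than a second eigendecomposition, so this is a Jennrich-style illustration and not the authors' implementation.

```python
import numpy as np

def rank_r_pinv(M, r):
    """Pseudoinverse of M restricted to its top-r singular triplets."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[:r].T @ np.diag(1.0 / s[:r]) @ U[:, :r].T

def jennrich_two_slices(S0, S1, r):
    """Recover the first two factors, up to column scaling and permutation,
    from two slices S0 = A diag(c0) B^T and S1 = A diag(c1) B^T."""
    # S0 @ pinv(S1) = A diag(c0 / c1) pinv(A), so its eigenvectors with
    # nonzero eigenvalues are the columns of A (assumes distinct ratios).
    evals, evecs = np.linalg.eig(S0 @ rank_r_pinv(S1, r))
    top = np.argsort(-np.abs(evals))[:r]
    A_hat = np.real(evecs[:, top])
    # Solve A_hat @ X = S0; the rows of X are scaled columns of B.
    X = np.linalg.lstsq(A_hat, S0, rcond=None)[0]
    return A_hat, X.T

def solve_third_mode(A_hat, B_hat, samples):
    """Solve for one row of the third factor from sampled entries of a slice.
    samples: iterable of (i, j, value), with at least r entries."""
    rows = np.array([A_hat[i] * B_hat[j] for i, j, _ in samples])
    vals = np.array([v for _, _, v in samples])
    return np.linalg.lstsq(rows, vals, rcond=None)[0]

# Synthetic rank-r tensor (assumed setup, for illustration only).
rng = np.random.default_rng(0)
n, n3, r = 8, 6, 3
A = rng.standard_normal((n, r))
B = rng.standard_normal((n, r))
C = rng.standard_normal((n3, r))
T = np.einsum('ir,jr,kr->ijk', A, B, C)

# "Bread": two slices assumed fully known. "Filling": 2r sampled entries per slice.
A_hat, B_hat = jennrich_two_slices(T[:, :, 0], T[:, :, 1], r)
C_hat = np.zeros((n3, r))
for k in range(n3):
    flat = rng.choice(n * n, size=2 * r, replace=False)
    samples = [(i, j, T[i, j, k]) for i, j in zip(*np.divmod(flat, n))]
    C_hat[k] = solve_third_mode(A_hat, B_hat, samples)

T_hat = np.einsum('ir,jr,kr->ijk', A_hat, B_hat, C_hat)
rel_err = np.linalg.norm(T_hat - T) / np.linalg.norm(T)
print("relative reconstruction error:", rel_err)
```

In this sketch each inner slice costs only $2r$ samples, because once the first two factors are fixed every sampled entry is a linear equation in the $r$ unknowns of that slice's row of the third factor. The oversampling factor of 2 is an arbitrary choice for the illustration, not a value from the paper.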