Tensor completion exhibits an interesting computational-statistical gap in the number of samples needed for consistent tensor estimation. While a $t$-order tensor with $n^t$ entries has only $\Theta(tn)$ degrees of freedom, the best known polynomial-time algorithm requires $O(n^{t/2})$ samples to guarantee consistent estimation. In this paper, we show that weak side information suffices to reduce the sample complexity to $O(n)$. The side information consists of a weight vector for each mode that is not orthogonal to any of the latent factors along that mode; this is significantly weaker than assuming noisy knowledge of the subspaces. We provide an algorithm that utilizes this side information to produce a consistent estimator with $O(n^{1+\kappa})$ samples for any small constant $\kappa > 0$. We also provide experiments on both synthetic and real-world datasets that validate our theoretical insights.