We study tensor-on-tensor regression, where the goal is to relate tensor responses to tensor covariates through a parameter tensor or matrix of low Tucker rank, without prior knowledge of its intrinsic rank. We propose Riemannian gradient descent (RGD) and Riemannian Gauss-Newton (RGN) methods and address the challenge of unknown rank by studying the effect of rank over-parameterization. We provide the first convergence guarantee for general tensor-on-tensor regression by showing that RGD and RGN converge linearly and quadratically, respectively, to a statistically optimal estimate in both the rank correctly-parameterized and the rank over-parameterized settings. Our theory reveals an intriguing phenomenon: Riemannian optimization methods naturally adapt to over-parameterization without modifications to their implementation. We also prove a statistical-computational gap in scalar-on-tensor regression by a direct low-degree polynomial argument. Our theory demonstrates a "blessing of statistical-computational gap" phenomenon: in a wide range of scenarios in tensor-on-tensor regression for tensors of order three or higher, the computationally required sample size matches the sample size needed under moderate rank over-parameterization when only computationally feasible estimators are considered, while there is no such benefit in the matrix setting. This shows that moderate rank over-parameterization is essentially "cost-free" in terms of sample size in tensor-on-tensor regression of order three or higher. Finally, we conduct simulation studies to show the advantages of our proposed methods and to corroborate our theoretical findings.
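To make the over-parameterization idea concrete, the following is a minimal sketch of Riemannian gradient descent in the simplest special case, scalar-on-matrix (trace) regression, run with a working rank larger than the true rank. It is an illustration under stated assumptions and not the algorithm as specified in the paper: the function names (`truncated_svd`, `rgd_trace_regression`), the spectral initialization, the exact line search for the step size, the truncated-SVD retraction, and the noiseless Gaussian design are all choices made here for the example.

```python
import numpy as np


def truncated_svd(M, r):
    """Return the leading r singular triplets of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :r], s[:r], Vt[:r, :]


def rgd_trace_regression(A, y, r, n_iter=100):
    """Illustrative RGD for y_i = <A_i, X> with a rank-r iterate.

    A has shape (n, p1, p2); r is the working rank, which may exceed
    the true rank (hypothetical helper, not taken from the paper).
    """
    n = A.shape[0]
    # Assumed spectral initialization: rank-r truncation of (1/n) sum_i y_i A_i.
    U, s, Vt = truncated_svd(np.einsum("i,ijk->jk", y, A) / n, r)
    X = (U * s) @ Vt
    for _ in range(n_iter):
        resid = np.einsum("ijk,jk->i", A, X) - y
        # Euclidean gradient of 0.5 * ||A(X) - y||^2.
        G = np.einsum("i,ijk->jk", resid, A)
        # Project onto the tangent-type subspace spanned by the current
        # singular subspaces: P(G) = P_U G + G P_V - P_U G P_V.
        GV = G @ Vt.T
        xi = U @ (U.T @ G) + GV @ Vt - U @ (U.T @ GV) @ Vt
        # Exact line search along -xi for the least-squares objective.
        A_xi = np.einsum("ijk,jk->i", A, xi)
        denom = A_xi @ A_xi
        if denom < 1e-30:
            break
        alpha = np.sum(xi * G) / denom
        # Retraction: truncate back to rank r.
        U, s, Vt = truncated_svd(X - alpha * xi, r)
        X = (U * s) @ Vt
    return X


# Noiseless synthetic example: true rank 2, working rank 3.
rng = np.random.default_rng(0)
p1, p2, r_true, r_fit, n = 10, 10, 2, 3, 1000
X_star = rng.standard_normal((p1, r_true)) @ rng.standard_normal((r_true, p2))
A = rng.standard_normal((n, p1, p2))
y = np.einsum("ijk,jk->i", A, X_star)
X_hat = rgd_trace_regression(A, y, r_fit)
rel_err = np.linalg.norm(X_hat - X_star) / np.linalg.norm(X_star)
print(f"relative error with over-specified rank: {rel_err:.2e}")
```

The same loop is used whether `r_fit` equals the true rank or exceeds it, which mirrors the claim that no change to the implementation is needed under over-parameterization. The tensor case would replace the truncated SVD with a Tucker-rank truncation, which is not shown here.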