Recent technological advances have led to contemporary applications that demand real-time processing and analysis of sequentially arriving tensor data. Traditional offline learning, which stores and reuses all data in each computational iteration, becomes impractical for high-dimensional tensor data due to its voluminous size. Furthermore, existing low-rank tensor methods lack the capability for statistical inference in an online fashion, which is essential for real-time predictions and informed decision-making. This paper addresses these challenges by introducing a novel online inference framework for low-rank tensor learning. Our approach employs Stochastic Gradient Descent (SGD) to enable efficient real-time data processing without extensive memory requirements, thereby significantly reducing computational demands. We establish a non-asymptotic convergence result for the online low-rank SGD estimator that nearly matches the minimax-optimal rate of estimation error achieved by offline models that store all historical data. Building upon this foundation, we propose a simple yet powerful online debiasing approach for sequential statistical inference in low-rank tensor learning. The entire online procedure, covering both estimation and inference, eliminates the need for data splitting or storing historical data, making it suitable for on-the-fly hypothesis testing. Given the sequential nature of our data collection, traditional analyses relying on offline methods and sample splitting are inadequate. In our analysis, we control the sum of constructed super-martingales to ensure that estimates along the entire solution path remain within the benign region. Additionally, a novel spectral representation tool is employed to address statistical dependencies among the iterative estimates, establishing the desired asymptotic normality.
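To illustrate the streaming setting, the following is a minimal sketch (not the paper's exact algorithm) of online SGD for rank-1 tensor regression under a factored parameterization: each step uses only the current observation, so no historical data is stored. All dimensions, step sizes, and the warm-start initialization are illustrative assumptions.

```python
import numpy as np

# Hypothetical model: y_t = <A*, X_t> + noise, where A* = u* o v* o w* is rank-1.
# We run SGD on the factors (u, v, w), touching one sample per iteration.
rng = np.random.default_rng(0)
d = 5
u_star, v_star, w_star = (rng.standard_normal(d) for _ in range(3))
A_star = np.einsum('i,j,k->ijk', u_star, v_star, w_star)

# Warm start near the truth (an assumption; local convergence regime).
u, v, w = (f + 0.1 * rng.standard_normal(d) for f in (u_star, v_star, w_star))
lr = 0.002

for t in range(10000):
    X = rng.standard_normal((d, d, d))                       # streaming covariate
    y = np.tensordot(A_star, X, axes=3) + 0.1 * rng.standard_normal()
    resid = np.tensordot(np.einsum('i,j,k->ijk', u, v, w), X, axes=3) - y
    # Gradients of 0.5 * resid**2 with respect to each factor.
    gu = resid * np.einsum('ijk,j,k->i', X, v, w)
    gv = resid * np.einsum('ijk,i,k->j', X, u, w)
    gw = resid * np.einsum('ijk,i,j->k', X, u, v)
    u, v, w = u - lr * gu, v - lr * gv, w - lr * gw          # one SGD step, then X is discarded

err = np.linalg.norm(np.einsum('i,j,k->ijk', u, v, w) - A_star) / np.linalg.norm(A_star)
print(err)
```

The relative error of the reconstructed tensor settles near the noise floor. The debiasing step for inference described in the abstract is not shown here; this sketch only conveys the constant-memory, one-pass nature of the estimation procedure.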