Contemporary applications, such as recommendation systems and mobile health monitoring, require real-time processing and analysis of sequentially arriving high-dimensional tensor data. Traditional offline learning, which stores and reuses all data in every computational iteration, is impractical for these tasks. Furthermore, existing low-rank tensor methods lack the capability for online statistical inference, which is essential for real-time prediction and informed decision-making. This paper addresses these challenges by introducing a novel online inference framework for low-rank tensors. Our approach employs stochastic gradient descent (SGD) to enable efficient real-time data processing without extensive memory requirements. We establish a non-asymptotic convergence result for the online low-rank SGD estimator that nearly matches the minimax optimal estimation error rate of offline models. Furthermore, we propose a simple yet powerful online debiasing approach for sequential statistical inference. The entire online procedure, covering both estimation and inference, eliminates the need for data splitting or storing historical data, making it suitable for on-the-fly hypothesis testing. In our analysis, we control the sum of constructed super-martingales to ensure that the estimates along the entire solution path remain within the benign region. Additionally, a novel spectral representation tool is employed to address statistical dependencies among the iterative estimates, establishing the desired asymptotic normality.
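As a rough illustration of the online estimation idea summarized above, the following minimal sketch processes one observation at a time and discards it after the update. Everything specific here is an assumption made for illustration and is not taken from the paper: a linear model y = <X, T*> + noise, a squared loss, a truncated higher-order SVD as the low-rank projection, a warm start near T*, and the hypothetical names `unfold`, `hosvd_truncate`, `online_sgd`, and `make_stream`. The debiasing and inference steps are not shown.

```python
import numpy as np

def unfold(T, mode):
    # Mode-k matricization: rows index the chosen mode.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd_truncate(T, ranks):
    # Truncated HOSVD: project every mode onto the top-r left singular
    # subspace of the corresponding unfolding of T (an assumed projection).
    out = T
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        P = U[:, :r] @ U[:, :r].T
        out = np.moveaxis(np.tensordot(P, out, axes=(1, mode)), 0, mode)
    return out

def online_sgd(stream, T0, ranks, step):
    # One pass over the stream; each (X, y) pair is used once and not stored.
    T = T0.copy()
    for X, y in stream:
        residual = np.sum(X * T) - y          # gradient of 0.5 * residual**2 is residual * X
        T = hosvd_truncate(T - step * residual * X, ranks)
    return T

rng = np.random.default_rng(0)
shape, ranks = (5, 5, 5), (2, 2, 2)

# Synthetic low-rank truth and a warm start near it (both assumptions).
T_star = hosvd_truncate(rng.standard_normal(shape), ranks)
T0 = hosvd_truncate(T_star + 0.1 * rng.standard_normal(shape), ranks)

def make_stream(n, noise=0.1):
    # Hypothetical data source: Gaussian covariate tensors with noisy responses.
    for _ in range(n):
        X = rng.standard_normal(shape)
        yield X, np.sum(X * T_star) + noise * rng.standard_normal()

T_hat = online_sgd(make_stream(4000), T0, ranks, step=0.005)
init_err = np.linalg.norm(T0 - T_star)
final_err = np.linalg.norm(T_hat - T_star)
print("initial error:", init_err, "final error:", final_err)
```

Memory use in this sketch is constant in the number of observations, since only the current iterate is kept. The step size and the warm start are chosen by hand here; the conditions under which such an iteration stays in a well-behaved region are what the paper's analysis addresses.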