In large-scale data processing, data often arrive in sequential streams generated by complex systems with drifting distributions and time-varying parameters. This nonstationarity complicates theoretical analysis because it violates the classical assumption of i.i.d. (independent and identically distributed) samples, and it calls for algorithms that update in real time without expensive retraining. An effective approach should process each sample in a single pass while keeping computational and memory complexity independent of the length of the data stream. Motivated by these challenges, this paper investigates the Momentum Least Mean Squares (MLMS) algorithm as an adaptive identification tool, leveraging its computational simplicity and online processing capability. Theoretically, we derive bounds on the tracking performance and regret of MLMS for time-varying stochastic linear systems under various practical conditions. The stability of classical LMS can be characterized by first-order random vector difference equations. MLMS, by contrast, carries an additional dynamical state due to momentum, which leads to second-order time-varying random vector difference equations. Their stability analysis hinges on more complicated products of random matrices and is therefore substantially more challenging. Experiments on synthetic and real-world data streams demonstrate that MLMS achieves rapid adaptation and robust tracking, in agreement with our theoretical results, especially in nonstationary settings, highlighting its promise for modern streaming and online learning applications.
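To make the single-pass, constant-memory update concrete, the following is a minimal sketch of a momentum LMS step. The abstract does not state the recursion, so the heavy-ball form used here, theta_next = theta + mu * phi * (y - phi·theta) + beta * (theta - theta_prev), is an assumption, as are the names `mlms_step` and `run_demo`, the step size `mu`, the momentum weight `beta`, and the random-walk drift model in the demo. It is an illustration under those assumptions, not the paper's algorithm or experimental setup.

```python
import math
import random


def mlms_step(theta, theta_prev, phi, y, mu=0.05, beta=0.3):
    """One momentum LMS update in an assumed heavy-ball form.

    theta_next = theta + mu * phi * (y - phi . theta) + beta * (theta - theta_prev)
    Only the current and previous estimates are stored, so the memory cost
    does not grow with the length of the stream.
    """
    # A priori prediction error for the incoming sample.
    err = y - sum(p * t for p, t in zip(phi, theta))
    theta_next = [
        t + mu * p * err + beta * (t - tp)
        for t, tp, p in zip(theta, theta_prev, phi)
    ]
    # Return the new estimate and the estimate that becomes "previous".
    return theta_next, theta


def run_demo(steps=5000, seed=0):
    """Track a slowly drifting parameter vector; return the final error norm."""
    rng = random.Random(seed)
    true_theta = [1.0, -2.0, 0.5]  # hypothetical initial system parameters
    theta = [0.0, 0.0, 0.0]
    theta_prev = [0.0, 0.0, 0.0]
    for _ in range(steps):
        # Assumed drift model: a small random walk on the true parameters.
        true_theta = [t + rng.gauss(0.0, 1e-3) for t in true_theta]
        # Regressor and noisy observation of the linear system.
        phi = [rng.gauss(0.0, 1.0) for _ in true_theta]
        y = sum(p * t for p, t in zip(phi, true_theta)) + rng.gauss(0.0, 0.1)
        # Each sample is used once and then discarded.
        theta, theta_prev = mlms_step(theta, theta_prev, phi, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(theta, true_theta)))
```

Setting `beta=0` reduces the step to the classical LMS update. The extra state `theta_prev` is what turns the error recursion from first order into second order.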