In deep learning, the recently introduced state space models utilize HiPPO (High-order Polynomial Projection Operators) memory units to approximate continuous-time trajectories of input functions via ordinary differential equations (ODEs), and these techniques have shown empirical success in capturing long-range dependencies in long input sequences. However, the mathematical foundations of these ODEs, particularly the singular HiPPO-LegS (scaled Legendre) ODE, and their corresponding numerical discretizations remain largely unexplored. In this work, we fill this gap by establishing that the HiPPO-LegS ODE is well-posed despite its singularity, albeit without the freedom to choose arbitrary initial conditions, and by establishing convergence of the associated numerical discretization schemes for Riemann-integrable input functions.
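For context on the objects the abstract names, here is a minimal numerical sketch of the HiPPO-LegS system. The transition matrices follow the standard HiPPO-LegS definition (A lower-triangular with entries √(2n+1)·√(2k+1) below the diagonal and n+1 on it, B with entries √(2n+1)); the forward-Euler recurrence on the grid t_k = k and the forced initial condition c(0) = f(0)·e₀ are our illustrative assumptions for this sketch, not this paper's construction or proofs.

```python
import numpy as np

def hippo_legs_matrices(N):
    # Standard HiPPO-LegS transition matrices:
    # A[n, k] = sqrt(2n+1) * sqrt(2k+1)  for n > k
    #         = n + 1                    for n == k
    #         = 0                        for n < k
    # B[n]    = sqrt(2n+1)
    n = np.arange(N)
    full = np.sqrt(2 * n[:, None] + 1) * np.sqrt(2 * n[None, :] + 1)
    A = np.tril(full, -1) + np.diag(n + 1)
    B = np.sqrt(2 * n + 1)
    return A, B

def legs_euler(f, K, N):
    # Forward-Euler discretization of the singular LegS ODE
    #   dc/dt = -(1/t) A c(t) + (1/t) B f(t)
    # on the integer grid t_k = k:
    #   c_{k+1} = c_k - (1/k) A c_k + (1/k) B f(t_k).
    # The singularity at t = 0 removes the freedom to pick c(0);
    # we start from c(0) = f(0) * e_0 (an assumption of this sketch).
    A, B = hippo_legs_matrices(N)
    c = np.zeros(N)
    c[0] = f(0.0)
    history = [c.copy()]
    for k in range(1, K + 1):
        c = c - (A @ c) / k + B * f(float(k)) / k
        history.append(c.copy())
    return np.array(history)
```

As a sanity check, a constant input f ≡ 1 is a fixed point of this recurrence: starting from c = e₀, the drift term -(1/k)·A·e₀ exactly cancels (1/k)·B, so the coefficient vector stays (1, 0, ..., 0) for all steps, matching the intuition that a constant function is represented by the zeroth Legendre coefficient alone.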