We study statistical inference for nonlinear stochastic approximation (SA) algorithms driven by a single trajectory of Markovian data. The methodology applies to settings such as stochastic gradient descent (SGD) on autoregressive data and asynchronous Q-learning. Using the standard SA framework to estimate the target parameter, we establish a functional central limit theorem for its partial-sum process, $\boldsymbol{\phi}_T$. To complement this result, we provide a matching semiparametric efficient lower bound and a non-asymptotic upper bound on the weak convergence of $\boldsymbol{\phi}_T$, measured in the L\'evy-Prokhorov metric. The functional central limit theorem forms the basis of our inference method: for any continuous scale-invariant functional $f$, the statistic $f(\boldsymbol{\phi}_T)$ is asymptotically pivotal, which allows us to construct an asymptotically valid confidence interval. We analyze the rejection probability of a family of functionals $f_m$, indexed by $m \in \mathbb{N}$, both theoretically and numerically. Simulation results demonstrate the validity and efficiency of the method.
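To make the inference recipe concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It uses a toy one-dimensional SA recursion that tracks the mean of an AR(1) chain, and one particular scale-invariant functional, $f(\phi) = \phi(1) / \sqrt{\int_0^1 (\phi(r) - r\phi(1))^2 \, dr}$, in the style of random scaling. This choice is an assumption and need not coincide with the family $f_m$ studied in the paper. All function names, the step-size schedule, and the Monte Carlo approximation of the critical value are hypothetical choices made for illustration.

```python
import numpy as np


def sa_mean_ar1(T, theta_star=1.0, rho=0.5, seed=0):
    """Toy SA recursion x_t = x_{t-1} - eta_t * (x_{t-1} - xi_t) on AR(1) data.

    xi_t is a single Markovian trajectory with stationary mean theta_star.
    The step size eta_t = 0.5 * t**(-0.67) is an illustrative choice.
    """
    rng = np.random.default_rng(seed)
    xi = theta_star          # start the chain at its stationary mean
    x = 0.0                  # arbitrary initial iterate
    iterates = np.empty(T)
    for t in range(1, T + 1):
        xi = theta_star + rho * (xi - theta_star) + rng.standard_normal()
        eta = 0.5 * t ** -0.67
        x -= eta * (x - xi)
        iterates[t - 1] = x
    return iterates


def average_and_scale(iterates):
    """Return the averaged iterate and the random scale V built from partial sums.

    With phi_T(r) = T**-0.5 * sum_{t <= T r} (x_t - x*), the bridge
    phi_T(s/T) - (s/T) * phi_T(1) equals s * (xbar_s - xbar_T) / sqrt(T),
    which does not depend on the unknown x*.
    """
    T = len(iterates)
    s = np.arange(1, T + 1)
    running_mean = np.cumsum(iterates) / s
    x_bar = running_mean[-1]
    bridge = s * (running_mean - x_bar) / np.sqrt(T)
    V = np.mean(bridge ** 2)  # Riemann sum for the integral of bridge**2
    return x_bar, V


def simulate_critical_value(level=0.95, n_paths=5000, n_steps=400, seed=1):
    """Monte Carlo quantile of |f(W)| for a discretized standard Brownian motion W."""
    rng = np.random.default_rng(seed)
    increments = rng.standard_normal((n_paths, n_steps)) / np.sqrt(n_steps)
    W = np.cumsum(increments, axis=1)
    r = np.arange(1, n_steps + 1) / n_steps
    bridge = W - r * W[:, -1:]
    V = np.mean(bridge ** 2, axis=1)
    stat = np.abs(W[:, -1]) / np.sqrt(V)
    return float(np.quantile(stat, level))


def confidence_interval(iterates, level=0.95):
    """Invert |f(phi_T)| <= q to get an interval for the target parameter."""
    T = len(iterates)
    x_bar, V = average_and_scale(iterates)
    q = simulate_critical_value(level)
    half_width = q * np.sqrt(V / T)
    return x_bar - half_width, x_bar + half_width
```

Calling `confidence_interval(sa_mean_ar1(20000))` returns an interval for the stationary mean without estimating the long-run variance: because $f$ is scale-invariant, the unknown asymptotic scale cancels between the numerator and the denominator.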