We study statistical inference for nonlinear stochastic approximation algorithms driven by a single trajectory of Markovian data. The methodology applies to settings such as stochastic gradient descent (SGD) on autoregressive data and asynchronous Q-learning. Using the standard stochastic approximation (SA) framework to estimate the target parameter, we establish a functional central limit theorem for the associated partial-sum process, $\boldsymbol{\phi}_T$. We complement this result with a matching semiparametric efficiency lower bound and a non-asymptotic upper bound on its rate of weak convergence, measured in the L\'evy-Prokhorov metric. The functional central limit theorem forms the basis of our inference method: for any continuous scale-invariant functional $f$, the statistic $f(\boldsymbol{\phi}_T)$ is asymptotically pivotal, which allows us to construct asymptotically valid confidence intervals. We analyze the rejection probability of a family of functionals $f_m$, indexed by $m \in \mathbb{N}$, both theoretically and numerically. Simulation results support the validity and efficiency of the method.
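To make the pivotal-statistic idea concrete, the following is a minimal sketch on a toy problem, not the procedure or the functionals $f_m$ analyzed in the paper. It assumes a scalar SA recursion that estimates the mean of a single AR(1) trajectory, takes $\boldsymbol{\phi}_T(r) = T^{-1/2}\sum_{t \le \lfloor Tr \rfloor}(x_t - \theta^\star)$ as the partial-sum process, and uses one particular scale-invariant functional, $f(\phi) = \phi(1) / \sqrt{\int_0^1 (\phi(r) - r\phi(1))^2\,dr}$. The function names, step-size schedule, and all parameter values are hypothetical choices made for illustration, and the critical value is approximated by Monte Carlo on a discretized Brownian motion rather than taken from a table.

```python
import numpy as np


def sa_mean_ar1(T, theta_star=1.0, rho=0.5, eta0=0.5, alpha=0.7, seed=0):
    """SA iterates x_1..x_T estimating the mean of one AR(1) trajectory (toy model)."""
    rng = np.random.default_rng(seed)
    xi = theta_star  # Markov state, started at its stationary mean
    x = 0.0          # SA iterate
    iterates = np.empty(T)
    for t in range(1, T + 1):
        # One step of the Markov chain: AR(1) around theta_star.
        xi = theta_star + rho * (xi - theta_star) + rng.standard_normal()
        # SGD step on the loss (x - xi)^2 / 2 with step size eta0 * t^(-alpha).
        x -= eta0 * t ** (-alpha) * (x - xi)
        iterates[t - 1] = x
    return iterates


def scale(iterates):
    """Riemann-sum version of sqrt(int_0^1 (phi(r) - r * phi(1))^2 dr).

    The centered partial sums do not depend on the unknown parameter.
    """
    T = len(iterates)
    bridge = np.cumsum(iterates - iterates.mean()) / np.sqrt(T)
    return np.sqrt(np.mean(bridge ** 2))


def critical_value(level=0.95, n_grid=500, n_rep=5000, seed=1):
    """Monte Carlo quantile of |W(1)| / sqrt(int_0^1 (W(r) - r W(1))^2 dr)."""
    rng = np.random.default_rng(seed)
    steps = rng.standard_normal((n_rep, n_grid)) / np.sqrt(n_grid)
    W = np.cumsum(steps, axis=1)              # discretized Brownian paths
    r = np.arange(1, n_grid + 1) / n_grid
    bridge = W - r * W[:, -1:]                # W(r) - r W(1)
    stat = np.abs(W[:, -1]) / np.sqrt(np.mean(bridge ** 2, axis=1))
    return np.quantile(stat, level)


def confidence_interval(iterates, q):
    """Invert |f(phi_T)| <= q: averaged iterate +/- q * scale / sqrt(T)."""
    T = len(iterates)
    center = iterates.mean()
    half_width = q * scale(iterates) / np.sqrt(T)
    return center - half_width, center + half_width


q = critical_value()
iterates = sa_mean_ar1(T=20000)
lo, hi = confidence_interval(iterates, q)
print(f"critical value: {q:.2f}, interval: [{lo:.3f}, {hi:.3f}]")
```

Because numerator and denominator of this $f$ scale together, the unknown asymptotic variance cancels, so no separate variance estimator is needed. Other continuous scale-invariant functionals would lead to different critical values and interval widths.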