We analyze the oracle complexity of the stochastic Halpern iteration with variance reduction, where the goal is to approximate fixed points of nonexpansive and contractive operators in a finite-dimensional normed space. We show that if the underlying stochastic oracle has uniformly bounded variance, our method exhibits an overall oracle complexity of $\tilde{O}(\varepsilon^{-5})$, improving on recent rates established for the stochastic Krasnoselskii-Mann iteration. We also establish a lower bound of $\Omega(\varepsilon^{-3})$, which applies to a wide range of algorithms, including all averaged iterations, even with minibatching. Using a suitable modification of our approach, we derive an $O(\varepsilon^{-2}(1-\gamma)^{-3})$ complexity bound when the operator is a $\gamma$-contraction. As an application, we propose new synchronous algorithms for average-reward and discounted-reward Markov decision processes. In particular, for the average-reward case, our method improves on the best-known sample complexity.
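For orientation, the Halpern iteration anchors every step to the starting point: $x^{k+1} = \lambda_k x^0 + (1-\lambda_k)\,T(x^k)$. The sketch below is only an illustration of that recursion with a noisy oracle, not the method analyzed here: it replaces the variance-reduction scheme, which the abstract does not specify, with plain minibatch averaging. The operator (a rotation of the plane, nonexpansive in the Euclidean norm), the Gaussian noise model, the weights $\lambda_k = 1/(k+2)$, the batch schedule, and all function names are assumptions chosen for the example.

```python
import math
import random


def rotate90(x):
    """Nonexpansive map T on R^2 (Euclidean norm); its only fixed point is the origin."""
    return [-x[1], x[0]]


def noisy_oracle(x, sigma, rng):
    """Hypothetical stochastic oracle: T(x) plus zero-mean Gaussian noise with std sigma."""
    return [t + rng.gauss(0.0, sigma) for t in rotate90(x)]


def residual(x):
    """Fixed-point residual ||x - T(x)|| in the Euclidean norm."""
    tx = rotate90(x)
    return math.hypot(x[0] - tx[0], x[1] - tx[1])


def minibatch_halpern(x0, n_iters, sigma=0.1, seed=0):
    """Halpern iteration with a minibatch-averaged oracle (illustrative only)."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(n_iters):
        batch = k + 1  # assumed schedule: batch size grows linearly with k
        total = [0.0, 0.0]
        for _ in range(batch):
            sample = noisy_oracle(x, sigma, rng)
            total[0] += sample[0]
            total[1] += sample[1]
        t_hat = [total[0] / batch, total[1] / batch]  # minibatch estimate of T(x)
        lam = 1.0 / (k + 2)  # weight on the anchor point x0
        x = [lam * x0[i] + (1.0 - lam) * t_hat[i] for i in range(2)]
    return x


if __name__ == "__main__":
    x = minibatch_halpern([1.0, 0.0], n_iters=200, sigma=0.1)
    print("residual:", residual(x))
```

Setting `sigma=0.0` recovers the deterministic Halpern iteration, which is a convenient sanity check before adding noise. Counting the total number of oracle calls, here the sum of the batch sizes, is what the term oracle complexity refers to.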