Two-time-scale stochastic approximation algorithms are iterative methods used in applications such as optimization, reinforcement learning, and control. Finite-time analyses of these algorithms have primarily focused on fixed-point iterations in which both time-scales involve contractive mappings. In this work, we broaden the scope of such analyses by considering settings where the slower time-scale has a non-expansive mapping. For such algorithms, the slower time-scale can be viewed as a stochastic inexact Krasnoselskii-Mann iteration. We also study a variant in which the faster time-scale includes a projection step, which induces non-expansiveness in the slower time-scale. We show that the last-iterate mean square residual error of such algorithms decays at a rate $O(1/k^{1/4-\epsilon})$, where $\epsilon>0$ is arbitrarily small. We further establish almost sure convergence of the iterates to the set of fixed points. Finally, we demonstrate the applicability of our framework through applications to minimax optimization, linear stochastic approximation, and Lagrangian optimization.
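To make the setting concrete, the following display sketches a generic two-time-scale update of the kind described above; the iterates $x_k, y_k$, the maps $F, G$, the step sizes $\alpha_k, \beta_k$, and the noise terms $\xi_{k+1}, \psi_{k+1}$ are illustrative placeholders rather than the paper's exact notation:
\[
x_{k+1} = x_k + \alpha_k\bigl(F(x_k, y_k) - x_k + \xi_{k+1}\bigr),
\qquad
y_{k+1} = y_k + \beta_k\bigl(G(x_k, y_k) - y_k + \psi_{k+1}\bigr),
\]
with $\alpha_k/\beta_k \to 0$, so that $x_k$ evolves on the slower time-scale. When the fast iterate $y_k$ tracks its equilibrium $y^\star(x_k)$ and the map $T(x) := F\bigl(x, y^\star(x)\bigr)$ is non-expansive, the slow update takes the form of a stochastic inexact Krasnoselskii-Mann iteration, $x_{k+1} = (1-\alpha_k)\,x_k + \alpha_k\bigl(T(x_k) + e_{k+1}\bigr)$, where $e_{k+1}$ collects the tracking error and the noise.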