In this paper, we present a novel stochastic normal map-based algorithm ($\mathsf{norM}\text{-}\mathsf{SGD}$) for nonconvex composite-type optimization problems and discuss its convergence properties. Using a time window-based strategy, we first analyze the global convergence behavior of $\mathsf{norM}\text{-}\mathsf{SGD}$ and show that every accumulation point of the generated sequence of iterates $\{\boldsymbol{x}^k\}_k$ corresponds to a stationary point, both almost surely and in an expectation sense. These results hold under standard assumptions and extend the more limited convergence guarantees of the basic proximal stochastic gradient method. In addition, based on the well-known Kurdyka-{\L}ojasiewicz (KL) analysis framework, we provide novel point-wise convergence results for the iterates $\{\boldsymbol{x}^k\}_k$ and derive convergence rates that depend on the underlying KL exponent $\boldsymbol{\theta}$ and the step size dynamics $\{\alpha_k\}_k$. Specifically, for the popular step size scheme $\alpha_k=\mathcal{O}(1/k^\gamma)$, $\gamma \in (\frac23,1]$, (almost sure) rates of the form $\|\boldsymbol{x}^k-\boldsymbol{x}^*\| = \mathcal{O}(1/k^p)$, $p \in (0,\frac12)$, can be established. The obtained rates are faster than existing related convergence rates for $\mathsf{SGD}$ and improve on the non-asymptotic complexity bounds for $\mathsf{norM}\text{-}\mathsf{SGD}$.
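The abstract does not spell out the update rule, so the following is only a minimal sketch of what a normal map-based stochastic step could look like. It assumes the composite problem $\min_x f(x) + \varphi(x)$, the normal map $F^\lambda_{\mathrm{nor}}(z) = \nabla f(\mathrm{prox}_{\lambda\varphi}(z)) + (z - \mathrm{prox}_{\lambda\varphi}(z))/\lambda$, and the update $z^{k+1} = z^k - \alpha_k (g^k + (z^k - x^k)/\lambda)$ with $x^k = \mathrm{prox}_{\lambda\varphi}(z^k)$ and $g^k$ a stochastic estimate of $\nabla f(x^k)$. All function names, the parameter values, and the toy problem (a convex $\ell_1$-regularized quadratic, chosen only because its minimizer is known in closed form) are illustrative assumptions and are not taken from the text above. The step size $\alpha_k = 0.5/k^{0.75}$ is one instance of the scheme $\alpha_k=\mathcal{O}(1/k^\gamma)$ with $\gamma \in (\frac23,1]$.

```python
import random


def soft_threshold(z, tau):
    """Componentwise proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return [(1.0 if zi > 0 else -1.0) * max(abs(zi) - tau, 0.0) for zi in z]


def norm_sgd_sketch(stoch_grad, prox, z0, lam, step_size, n_iters):
    """Hypothetical normal map-based stochastic iteration (assumed form).

    stoch_grad(x): noisy estimate of grad f(x)
    prox(z, lam):  proximal operator of lam * phi evaluated at z
    step_size(k):  step size alpha_k for iteration k >= 1
    """
    z = list(z0)
    for k in range(1, n_iters + 1):
        x = prox(z, lam)          # x^k = prox_{lam*phi}(z^k)
        g = stoch_grad(x)         # stochastic gradient at x^k
        a = step_size(k)
        # z^{k+1} = z^k - alpha_k * (g^k + (z^k - x^k) / lam)
        z = [zi - a * (gi + (zi - xi) / lam) for zi, gi, xi in zip(z, g, x)]
    return prox(z, lam)


# Toy problem (assumption): f(x) = 0.5 * ||x - b||^2, phi(x) = mu * ||x||_1.
# Its exact minimizer is soft_threshold(b, mu).
rng = random.Random(0)
b = [2.0, 0.05, -1.0]
mu = 0.1
lam = 0.5


def noisy_grad(x):
    # grad f(x) = x - b, perturbed by zero-mean Gaussian noise
    return [xi - bi + rng.gauss(0.0, 0.1) for xi, bi in zip(x, b)]


def prox_l1(z, lam_):
    # prox of lam_ * mu * ||.||_1
    return soft_threshold(z, lam_ * mu)


x_hat = norm_sgd_sketch(
    noisy_grad, prox_l1, [0.0, 0.0, 0.0], lam,
    step_size=lambda k: 0.5 / k ** 0.75, n_iters=5000,
)
print(x_hat)  # expected to land near soft_threshold(b, mu)
```

In this sketch the iteration runs on the auxiliary variable $z^k$, and the iterate $x^k$ is always obtained through the proximal operator, so $x^k$ can be exactly sparse even though the stochastic step itself is a plain unconstrained update. The example is not meant to reproduce the convergence rates discussed above.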