In this work, we consider constrained stochastic optimization problems under hidden convexity, i.e., problems that admit a convex reformulation via a non-linear (but invertible) map $c(\cdot)$. A number of non-convex problems, ranging from optimal control, revenue and inventory management, to convex reinforcement learning, admit such a hidden convex structure. Unfortunately, in the majority of applications considered, the map $c(\cdot)$ is unavailable or implicit; therefore, directly solving the convex reformulation is not possible. On the other hand, stochastic gradients with respect to the original variable are often easy to obtain. Motivated by these observations, we examine the basic projected stochastic (sub)gradient methods for solving such problems under hidden convexity. We provide the first sample complexity guarantees for global convergence in both the smooth and non-smooth settings. Additionally, in the smooth setting, we strengthen our results to last-iterate convergence in terms of the function value gap using a momentum variant of projected stochastic gradient descent.
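To fix ideas, the hidden convexity setting can be sketched as follows; the symbols $F$, $f$, $\xi$, $\mathcal{X}$, $g$, and $\mathcal{U}$ below are illustrative notation introduced for this sketch, not taken from the abstract. The original problem
% Illustrative sketch (notation assumed, not from the source): a possibly
% non-convex expectation minimized over a closed convex set $\mathcal{X}$,
% which becomes convex after the invertible change of variables $u = c(x)$.
\[
  \min_{x \in \mathcal{X}} \; F(x) := \mathbb{E}_{\xi}\bigl[f(x,\xi)\bigr]
\]
may be non-convex in $x$, but is assumed to admit an invertible map $c(\cdot)$ such that $g(u) := F\bigl(c^{-1}(u)\bigr)$ is convex on $\mathcal{U} := c(\mathcal{X})$; the original problem is then equivalent to the convex problem $\min_{u \in \mathcal{U}} g(u)$, even though $c(\cdot)$ itself may be unavailable.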