Motivated by the desire to understand stochastic algorithms for nonconvex optimization that are robust to hyperparameter choices, we analyze a mini-batched prox-linear iterative algorithm for the problem of recovering an unknown rank-1 matrix from rank-1 Gaussian measurements corrupted by noise. We derive a deterministic recursion that predicts the error of this method and show, using a non-asymptotic framework, that this prediction is accurate for any batch-size and a large range of step-sizes. In particular, our analysis reveals that this method, though stochastic, converges linearly from a local initialization with a fixed step-size to a statistical error floor. Our analysis also exposes how the batch-size, step-size, and noise level affect the (linear) convergence rate and the eventual statistical estimation error, and we demonstrate how to use our deterministic predictions to perform hyperparameter tuning (e.g., step-size and batch-size selection) without ever running the method. On a technical level, our analysis is enabled in part by showing that the fluctuations of the empirical iterates around our deterministic predictions scale with the error of the previous iterate.