We consider extensions of the Newton-MR algorithm for nonconvex optimization to settings where Hessian information is approximated. Under an additive noise model on the Hessian matrix, we investigate the iteration and operation complexities of these variants for achieving first- and second-order sub-optimality criteria. We show that, under certain conditions, the algorithms achieve iteration and operation complexities that match those of the exact variant. Focusing on the particular class of nonconvex problems satisfying the Polyak-\L{}ojasiewicz condition, we show that our algorithm achieves a linear convergence rate. Finally, we compare the performance of our algorithms with several alternatives on a few machine learning problems.
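For orientation, the two conditions named in the abstract can be sketched as follows. The notation is an assumption for illustration only: the symbols $\tilde{\mathbf{H}}$, $\mathbf{E}$, $\varepsilon$, $\mu$, and $f^\star$ do not appear in the abstract, and the paper's precise assumptions (for example, how the noise is bounded) may differ.

```latex
% Additive noise model on the Hessian (assumed form: bounded perturbation E).
\[
  \tilde{\mathbf{H}}(\mathbf{x}) = \nabla^2 f(\mathbf{x}) + \mathbf{E}(\mathbf{x}),
  \qquad \|\mathbf{E}(\mathbf{x})\| \le \varepsilon .
\]

% Polyak-Lojasiewicz (PL) condition in its standard form, with constant
% mu > 0 and optimal value f^\star = \inf_x f(x).
\[
  f(\mathbf{x}) - f^\star \;\le\; \frac{1}{2\mu}\,\|\nabla f(\mathbf{x})\|^2
  \qquad \text{for all } \mathbf{x}.
\]
```

The PL inequality does not require convexity, yet it forces every stationary point to be a global minimizer, which is why a linear rate is plausible for this class of nonconvex problems.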