We consider minimization of a smooth nonconvex function given inexact oracle access to its gradient and Hessian, without assuming access to function values, with the goal of reaching approximate second-order optimality. A novel feature of our method is that when an approximate direction of negative curvature is chosen as the step, its sign is taken to be positive or negative with equal probability. We allow gradients to be inexact in a relative sense and relax the coupling between the inexactness thresholds for the first- and second-order optimality conditions. Our convergence analysis includes both an expectation bound based on martingale analysis and a high-probability bound based on concentration inequalities. We apply our algorithm to empirical risk minimization problems and obtain a gradient sample complexity that improves on existing work.
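The randomized choice of sign for a negative curvature step can be illustrated with a minimal sketch. This is not the algorithm analyzed in the paper: the function name `randomized_step`, the tolerances `eps_g` and `eps_h`, and the step-size parameters `lr_g` and `lr_h` are all hypothetical, and the full eigendecomposition stands in for whatever approximate negative curvature oracle the method actually uses. The sketch only shows the control flow suggested by the abstract: take a gradient step when the (possibly inexact) gradient is large, otherwise step along a direction of negative curvature with a fair coin flip deciding its sign, and never evaluate the function itself.

```python
import numpy as np


def randomized_step(x, grad, hess, eps_g, eps_h, lr_g, lr_h, rng):
    """One illustrative step; returns (new_x, kind).

    grad and hess are (possibly inexact) oracles. All tolerances and
    step sizes are hypothetical placeholders, not the paper's choices.
    """
    g = grad(x)
    if np.linalg.norm(g) > eps_g:
        # First-order condition violated: plain gradient step.
        return x - lr_g * g, "gradient"

    # eigh returns eigenvalues in ascending order, so index 0 is the smallest.
    eigvals, eigvecs = np.linalg.eigh(hess(x))
    lam, v = eigvals[0], eigvecs[:, 0]
    if lam < -eps_h:
        # Second-order condition violated: step along the negative
        # curvature direction, with its sign chosen by a fair coin flip.
        sign = rng.choice([-1.0, 1.0])
        return x + sign * lr_h * abs(lam) * v, "negative_curvature"

    # Both approximate optimality conditions hold.
    return x, "stop"


# Usage on the saddle f(x) = 0.5 * (x[0]**2 - x[1]**2), with exact oracles.
saddle_grad = lambda x: np.array([x[0], -x[1]])
saddle_hess = lambda x: np.diag([1.0, -1.0])
rng = np.random.default_rng(0)
x_new, kind = randomized_step(
    np.zeros(2), saddle_grad, saddle_hess,
    eps_g=1e-3, eps_h=0.1, lr_g=0.5, lr_h=0.5, rng=rng,
)
```

At the saddle point the gradient vanishes, so the sketch falls through to the curvature branch and moves along the second coordinate axis in a randomly chosen direction. For a purely quadratic model either sign yields the same curvature term, which is one reason a coin flip is a natural choice when neither function values nor exact gradients are available to break the tie.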