Block majorization-minimization (BMM) is a simple iterative algorithm for constrained nonconvex optimization that sequentially minimizes majorizing surrogates of the objective function in each block while the others are held fixed. BMM subsumes a large class of optimization algorithms, such as block coordinate descent and its proximal-point variant, expectation-maximization, and block projected gradient descent. We first establish that for general constrained nonsmooth nonconvex optimization, BMM with $\rho$-strongly convex and $L_g$-smooth surrogates can produce an $\epsilon$-approximate first-order optimal point within $\widetilde{O}((1+L_g+\rho^{-1})\epsilon^{-2})$ iterations and asymptotically converges to the set of first-order optimal points. Next, we show that BMM combined with trust-region methods with diminishing radius has an improved complexity of $\widetilde{O}((1+L_g) \epsilon^{-2})$, independent of the inverse strong convexity parameter $\rho^{-1}$, allowing improved theoretical and practical performance with `flat' surrogates. Our results hold robustly even when the convex sub-problems are solved inexactly, as long as the optimality gaps are summable. Central to our analysis is a novel continuous first-order optimality measure, by which we bound the worst-case sub-optimality in each iteration by the first-order improvement the algorithm makes. We apply our general framework to obtain new results on various algorithms, such as the celebrated multiplicative update algorithm for nonnegative matrix factorization by Lee and Seung, regularized nonnegative tensor decomposition, and the classical block projected gradient descent algorithm. Lastly, we numerically demonstrate that the additional use of diminishing radius can improve the convergence rate of BMM in many instances.
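To make the BMM scheme concrete, the following is a minimal NumPy sketch of one of its best-known instances mentioned above: the multiplicative update (MU) algorithm of Lee and Seung for nonnegative matrix factorization. Each MU step exactly minimizes a majorizing surrogate of $\|V - WH\|_F^2$ in one block ($W$ or $H$) while the other block is held fixed, so the objective is monotonically non-increasing. Function and variable names here are illustrative, not from the paper.

```python
import numpy as np

def nmf_multiplicative_update(V, rank, n_iters=200, eps=1e-10, seed=0):
    """Sketch of BMM via Lee-Seung multiplicative updates for NMF.

    Minimizes ||V - W H||_F^2 over W >= 0, H >= 0 by alternating
    surrogate minimizations in the blocks W and H.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    losses = []
    for _ in range(n_iters):
        # Block 1: update H; this step minimizes a majorizing
        # surrogate of the objective with W held fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Block 2: update W; symmetric surrogate step with H fixed.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        losses.append(np.linalg.norm(V - W @ H, "fro") ** 2)
    return W, H, losses
```

The per-iteration monotone decrease of `losses` is exactly the majorization-minimization guarantee: each block update cannot increase the objective, since the surrogate touches the objective at the current iterate and dominates it elsewhere.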