We consider the block coordinate descent method of Gauss-Seidel type with proximal regularization (BCD-PR), a classical method for minimizing general nonconvex objectives under constraints that has a wide range of practical applications. We theoretically establish a worst-case complexity bound for this algorithm. Namely, we show that for general nonconvex smooth objectives with block-wise constraints, the classical BCD-PR algorithm converges to an epsilon-stationary point within O(1/epsilon) iterations. Under a mild condition, this result still holds even if the algorithm is executed inexactly in each step. As an application, we propose a provable and efficient algorithm for `Wasserstein CP-dictionary learning', which seeks a set of elementary probability distributions that can closely approximate a given set of d-dimensional joint probability distributions. Our algorithm is a version of BCD-PR that operates in the dual space, where the primal problem is regularized both entropically and proximally.
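To make the BCD-PR iteration concrete, the following is a minimal sketch on a toy problem that is not taken from the paper: nonnegative matrix factorization with two blocks, W and H, each constrained to be entrywise nonnegative. Each block is updated in Gauss-Seidel order by approximately minimizing the objective plus a proximal term (lam/2)*||block - previous block||^2, using a few projected gradient steps as a stand-in for inexact execution. The function names, the parameter `lam`, the inner solver, and the choice of objective are all illustrative assumptions.

```python
import numpy as np


def prox_block_update(block, grad_f, lipschitz, lam, n_inner):
    """Inexactly minimize f(block) + (lam/2)*||block - anchor||^2 over block >= 0.

    `anchor` is the block's value at the start of the update. The inner
    solver (projected gradient with step 1/(lipschitz + lam)) is an
    illustrative choice, not the paper's prescribed subroutine.
    """
    anchor = block.copy()
    step = 1.0 / (lipschitz + lam)
    for _ in range(n_inner):
        grad = grad_f(block) + lam * (block - anchor)
        block = np.maximum(block - step * grad, 0.0)  # project onto block >= 0
    return block


def bcd_pr_nmf(X, rank, lam=1.0, n_outer=50, n_inner=10, seed=0):
    """Toy BCD-PR for min 0.5*||X - W H||_F^2 subject to W >= 0, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank))
    H = rng.random((rank, X.shape[1]))
    losses = [0.5 * np.linalg.norm(X - W @ H) ** 2]
    for _ in range(n_outer):
        # Block 1: update W with H fixed at its current value.
        W = prox_block_update(
            W, lambda B: (B @ H - X) @ H.T,
            np.linalg.norm(H @ H.T, 2), lam, n_inner)
        # Block 2: update H using the freshly updated W (Gauss-Seidel order).
        H = prox_block_update(
            H, lambda B: W.T @ (W @ B - X),
            np.linalg.norm(W.T @ W, 2), lam, n_inner)
        losses.append(0.5 * np.linalg.norm(X - W @ H) ** 2)
    return W, H, losses
```

Because each inner loop starts at the previous block value and uses a step size of one over the subproblem's gradient Lipschitz constant, the recorded objective values in this sketch do not increase from one outer iteration to the next.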