Despite the considerable success of Bregman proximal-type algorithms, such as mirror descent, in machine learning, a critical question remains: can existing stationarity measures, which are often based on the Bregman divergence, reliably distinguish between stationary and non-stationary points? In this paper, we show that all existing stationarity measures necessarily imply the existence of spurious stationary points. We further establish an algorithm-independent hardness result: Bregman proximal-type algorithms are unable to escape from a spurious stationary point in finitely many steps when the initial point is unfavorable, even for convex problems. This hardness result highlights the inherent distinction between Euclidean and Bregman geometries and poses fundamental theoretical and numerical challenges for both the machine learning and optimization communities.
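The behaviour described above can be illustrated on a toy problem. The sketch below is not the paper's construction; it is a minimal example under assumptions chosen here for illustration: the negative-entropy kernel on the two-point probability simplex, the linear (hence convex) objective f(x) = <c, x> with c = (0, 1), and hypothetical helper names `mirror_descent_step`, `kl_divergence`, and `steps_to_escape`. The minimiser is e_1 = (1, 0), while e_2 = (0, 1) is the worst feasible point. Starting at (delta, 1 - delta), the number of steps needed to leave the neighbourhood of e_2 grows without bound as delta shrinks, and the Bregman divergence between consecutive iterates stays tiny there even though the point is far from optimal.

```python
import math


def mirror_descent_step(x, c, step):
    """One entropic mirror descent step on the simplex for f(x) = <c, x>."""
    # The negative-entropy kernel gives a multiplicative update,
    # followed by renormalisation back onto the simplex.
    w = [xi * math.exp(-step * ci) for xi, ci in zip(x, c)]
    total = sum(w)
    return [wi / total for wi in w]


def kl_divergence(p, q):
    """Bregman divergence of negative entropy between two simplex points."""
    # Terms with p_i == 0 contribute nothing (0 * log 0 is taken as 0).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def steps_to_escape(delta, c=(0.0, 1.0), step=1.0, max_steps=10_000):
    """Count steps until at least half the mass sits on the minimiser e_1.

    Returns None if that never happens within max_steps.
    """
    x = [delta, 1.0 - delta]
    for k in range(max_steps + 1):
        if x[0] >= 0.5:
            return k
        x = mirror_descent_step(x, c, step)
    return None


if __name__ == "__main__":
    # The closer the start is to e_2, the longer the iterates linger there.
    for delta in (1e-3, 1e-6, 1e-12):
        print(f"delta={delta:g}: escaped after {steps_to_escape(delta)} steps")

    # Near e_2 the divergence-based measure is tiny although f(x) is close to 1.
    c = (0.0, 1.0)
    x = [1e-12, 1.0 - 1e-12]
    x_next = mirror_descent_step(x, c, 1.0)
    print("objective value:", sum(ci * xi for ci, xi in zip(c, x)))
    print("divergence between iterates:", kl_divergence(x_next, x))
```

In this toy setting the escape time scales roughly like log(1/delta) divided by the step size, so for any fixed step budget there is an interior starting point from which the iterates remain near e_2. Starting exactly at e_2 (delta = 0), the multiplicative update never moves at all.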