Information-Geometric Optimization (IGO) provides a unified framework for black-box optimization by interpreting the adaptation of a search distribution as a natural gradient update. Despite its conceptual importance, the convergence theory of IGO remains limited: most existing results concern continuous-time idealizations such as the IGO flow, rather than discrete-time updates with non-infinitesimal learning rates. In this paper, we study discrete-time IGO in continuous spaces, formulated as natural gradient updates in the expectation-parameter coordinates of an exponential family. In particular, we analyze IGO over the multivariate Gaussian family on strongly convex quadratic objective functions. Our analysis covers a setting that simultaneously incorporates full covariance adaptation, a fixed positive learning rate, and quantile-based weights. In this setting, we prove that the covariance matrix converges to the zero matrix. We further show that the mean vector converges to the global optimum, provided that the condition number of the appropriately scaled covariance matrix is bounded at sufficiently frequent iterations. These results advance the convergence theory of IGO and help bridge the gap between the mathematical theory of IGO and practical covariance-adaptive search methods such as CMA-ES.
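To make the setting concrete, the following is a minimal sketch of one way a discrete-time IGO step over the Gaussian family can be written, assuming the update is applied to the expectation parameters E[x] and E[x xᵀ] and then converted back to the mean and covariance. The function names (`igo_gaussian_step`, `run_igo`), the sample size, the learning rate, the truncation form of the quantile-based weights, and the test objective with Hessian diag(1, 10) are all illustrative assumptions, not the exact formulation or experimental setup analyzed in the paper.

```python
import numpy as np


def igo_gaussian_step(mean, cov, objective, rng, learning_rate=0.3,
                      n_samples=40, elite_fraction=0.25):
    """One illustrative discrete-time IGO step for a Gaussian search distribution."""
    dim = mean.shape[0]
    # Sample deviations y_i ~ N(0, cov) and candidates x_i = mean + y_i.
    chol = np.linalg.cholesky(cov)
    deviations = rng.standard_normal((n_samples, dim)) @ chol.T
    candidates = mean + deviations
    values = np.array([objective(x) for x in candidates])

    # Quantile-based weights (assumed truncation form): the best
    # elite_fraction of the samples share the weight equally; the rest get 0.
    n_elite = max(1, int(round(elite_fraction * n_samples)))
    weights = np.zeros(n_samples)
    weights[np.argsort(values)[:n_elite]] = 1.0 / n_elite

    # Weighted first and second moments of the deviations.
    shift = weights @ deviations                               # sum_i w_i y_i
    scatter = (deviations * weights[:, None]).T @ deviations   # sum_i w_i y_i y_i^T

    # Moving E[x] and E[x x^T] toward their weighted sample estimates with a
    # fixed step size, then converting back to (mean, cov), gives this form
    # (valid because the weights sum to 1).
    new_mean = mean + learning_rate * shift
    new_cov = ((1.0 - learning_rate) * cov
               + learning_rate * scatter
               - learning_rate ** 2 * np.outer(shift, shift))
    # Symmetrize to suppress floating-point asymmetry.
    return new_mean, 0.5 * (new_cov + new_cov.T)


def run_igo(n_iters=300, seed=0):
    """Run the sketch on an assumed strongly convex quadratic f(x) = x^T H x / 2."""
    hessian = np.diag([1.0, 10.0])
    objective = lambda x: 0.5 * x @ hessian @ x
    rng = np.random.default_rng(seed)
    mean, cov = np.array([3.0, -2.0]), np.eye(2)
    for _ in range(n_iters):
        mean, cov = igo_gaussian_step(mean, cov, objective, rng)
    return mean, cov, objective
```

Calling `run_igo()` returns the final mean, the final covariance, and the objective, so the trace of the covariance and the objective value at the mean can be inspected over different seeds and learning rates. A finite-sample run of this kind only illustrates the qualitative behavior described above; it does not substitute for the convergence analysis.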