Let $V_* : \mathbb{R}^d \to \mathbb{R}$ be a (possibly non-convex) potential function, and consider the probability measure $\pi \propto e^{-V_*}$. When $\pi$ exhibits multiple modes, sampling techniques based on Wasserstein gradient flows of the Kullback-Leibler (KL) divergence (e.g. Langevin Monte Carlo) are known to converge slowly, because the dynamics cannot easily move between modes. In stark contrast, the work of Lu et al. (2019; 2022) has shown that the gradient flow of the KL divergence with respect to the Fisher-Rao (FR) geometry exhibits a convergence rate to $\pi$ that is \textit{independent} of the potential function. In this short note, we complement these existing results by providing an explicit expansion of $\text{KL}(\rho_t^{\text{FR}}\|\pi)$ in terms of $e^{-t}$, where $(\rho_t^{\text{FR}})_{t\geq 0}$ is the FR gradient flow of the KL divergence. In turn, we obtain a clean asymptotic convergence rate, with a burn-in time that is guaranteed to be finite. Our proof relies on a similarity between FR gradient flows and simulated annealing with linear scaling, together with facts about cumulant generating functions. We conclude with simple synthetic experiments demonstrating that our theoretical findings are indeed tight. Based on our numerics, we conjecture that, in some cases, the asymptotic rates of convergence for Wasserstein-Fisher-Rao gradient flows may be related to this expansion.
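As a rough numerical illustration of the annealing-like structure mentioned above, the following sketch evaluates $\text{KL}(\rho_t^{\text{FR}}\|\pi)$ on a one-dimensional grid. It is not the experimental setup of this note. It assumes the standard closed form $\rho_t^{\text{FR}} \propto \rho_0^{e^{-t}}\,\pi^{1-e^{-t}}$ for the FR gradient flow of the KL divergence. The bimodal target, the Gaussian initialization, the grid, and the helper names `log_normalize` and `kl_along_fr_flow` are all hypothetical choices made for illustration. The comparison term $\tfrac{1}{2}\mathrm{Var}_\pi(\log(\rho_0/\pi))\,e^{-2t}$ is a second-cumulant approximation derived under that assumption, not a formula quoted from the results.

```python
import numpy as np


def log_normalize(log_w, dx):
    """Shift a log-weight vector so that sum(exp(.)) * dx == 1 on the grid."""
    m = log_w.max()
    return log_w - (m + np.log(np.sum(np.exp(log_w - m)) * dx))


# Hypothetical 1D setup: uniform grid, bimodal target, wide Gaussian start.
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

# Target pi: equal mixture of N(-3, 1) and N(3, 1), so V_* is non-convex.
log_pi = log_normalize(np.logaddexp(-0.5 * (x + 3) ** 2, -0.5 * (x - 3) ** 2), dx)
# Initialization rho_0: N(0, 2^2), restricted to the grid.
log_rho0 = log_normalize(-0.5 * (x / 2.0) ** 2, dx)


def kl_along_fr_flow(t):
    """KL(rho_t || pi), ASSUMING rho_t ∝ rho_0^{e^{-t}} * pi^{1 - e^{-t}}."""
    s = np.exp(-t)
    log_rho_t = log_normalize(s * log_rho0 + (1.0 - s) * log_pi, dx)
    return np.sum(np.exp(log_rho_t) * (log_rho_t - log_pi)) * dx


times = np.linspace(0.0, 6.0, 13)
kl = np.array([kl_along_fr_flow(t) for t in times])

# Second-cumulant approximation (an assumption of this sketch):
# KL(rho_t || pi) ≈ 0.5 * Var_pi(log(rho_0 / pi)) * e^{-2t} for large t.
f = log_rho0 - log_pi
pi = np.exp(log_pi)
mean_f = np.sum(pi * f) * dx
var_f = np.sum(pi * (f - mean_f) ** 2) * dx
leading = 0.5 * var_f * np.exp(-2.0 * times)

for t, k, l in zip(times, kl, leading):
    print(f"t = {t:4.1f}   KL = {k:.3e}   0.5*Var*e^(-2t) = {l:.3e}")
```

In this sketch nothing depends on the height of the barrier between the two modes except through $\log(\rho_0/\pi)$, which is consistent with the potential-independent rate described above. Changing the mode separation or the initialization should only rescale the printed columns for large $t$.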