Particle gradient descent, which represents a probability measure by a set of particles and performs gradient descent on the particles in parallel, is widely used to optimize functions of probability measures. This paper considers particle gradient descent with a finite number of particles and establishes theoretical guarantees for optimizing functions that are \emph{displacement convex} in measures. Concretely, for Lipschitz displacement convex functions defined on probability measures over $\mathbb{R}^d$, we prove that $O(1/\epsilon^2)$ particles and $O(d/\epsilon^4)$ computations are sufficient to find an $\epsilon$-optimal solution. We further provide improved complexity bounds for optimizing smooth displacement convex functions. We demonstrate the application of our results to function approximation with specific neural architectures that take two-dimensional inputs.
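To make the particle update concrete, the following is a minimal sketch and not the paper's algorithm or experimental setting. It assumes an illustrative objective, the potential energy $F(\mu) = \mathbb{E}_{x \sim \mu}[V(x)]$ with the convex, 1-Lipschitz potential $V(x) = \sqrt{1 + \|x - c\|^2}$; a potential energy with convex $V$ is a classical example of a displacement convex function. The names `potential`, `objective`, `particle_gradient_descent`, the center `c`, the step size, and the particle count are all hypothetical choices made for illustration.

```python
import math
import random


def potential(x, center):
    """Illustrative convex, 1-Lipschitz potential V(x) = sqrt(1 + ||x - center||^2)."""
    return math.sqrt(1.0 + sum((xi - ci) ** 2 for xi, ci in zip(x, center)))


def potential_grad(x, center):
    """Gradient of V at x: (x - center) / V(x)."""
    v = potential(x, center)
    return [(xi - ci) / v for xi, ci in zip(x, center)]


def objective(particles, center):
    """F evaluated at the empirical measure of the particles: (1/n) * sum_i V(x_i)."""
    return sum(potential(x, center) for x in particles) / len(particles)


def particle_gradient_descent(particles, center, step_size, num_steps):
    """Run gradient descent on every particle; the updates are independent of each other.

    The gradient of F with respect to particle x_i is (1/n) * grad V(x_i);
    the factor 1/n is absorbed into step_size here (an assumption of this sketch).
    """
    particles = [list(x) for x in particles]
    for _ in range(num_steps):
        particles = [
            [xi - step_size * gi for xi, gi in zip(x, potential_grad(x, center))]
            for x in particles
        ]
    return particles


# Hypothetical problem sizes, chosen only for the demonstration.
rng = random.Random(0)
d, n = 2, 50
center = [1.0, -2.0]
init = [[rng.gauss(0.0, 3.0) for _ in range(d)] for _ in range(n)]
final = particle_gradient_descent(init, center, step_size=0.5, num_steps=500)

print("objective before:", objective(init, center))
print("objective after: ", objective(final, center))
```

For this particular potential the minimizer is the point mass at `c`, with optimal value 1, so the printed objective should decrease toward 1. Objectives with interactions between particles would couple the per-particle gradients, which this sketch does not cover.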