We propose STANLEY, a STochastic gradient ANisotropic LangEvin dYnamics, for sampling high-dimensional data. Energy-Based Models (EBMs), also known as non-normalized probabilistic models, are increasingly effective at modeling the generative process of many kinds of high-dimensional observations, and we present an end-to-end learning algorithm for EBMs that aims to improve the quality of the sampled data points. The unknown normalizing constant of an EBM makes the training procedure intractable, so resorting to Markov Chain Monte Carlo (MCMC) is in general a viable option. With the role of MCMC in EBM training in mind, we introduce a novel high-dimensional sampling method based on an anisotropic stepsize and a gradient-informed covariance matrix, embedded in a discretized Langevin diffusion. We motivate the need for an anisotropic update of the negative samples in the Markov chain by the nonlinearity of the EBM backbone, here a Convolutional Neural Network. The resulting method, STANLEY, is an optimization algorithm that trains EBMs using this newly introduced MCMC sampler. We support the sampling scheme theoretically by proving that it leads to a geometrically uniformly ergodic Markov chain. Several image generation experiments show the effectiveness of the method.
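To make the idea of an anisotropic stepsize inside a discretized Langevin update more concrete, the following is a minimal sketch and not the paper's exact algorithm. It assumes a per-coordinate stepsize of the form `base_step * threshold / max(threshold, |gradient|)`, a diagonal noise covariance equal to that stepsize, and a toy quadratic energy in place of the CNN-based EBM. The names `SCALES`, `threshold`, `base_step`, and all function names are hypothetical and chosen for illustration only. The chain is unadjusted (no Metropolis correction), so it is only an approximate sampler.

```python
import numpy as np

# Toy, badly conditioned quadratic energy E(x) = 0.5 * sum(SCALES * x**2).
# This stands in for the EBM energy; it is an assumption for illustration.
SCALES = np.array([1.0, 100.0])


def grad_energy(x):
    """Gradient of the toy energy with respect to x."""
    return SCALES * x


def anisotropic_step_sizes(grad, threshold=1.0, base_step=0.01):
    """Per-coordinate stepsize that shrinks where the gradient is large.

    Assumed form: base_step * threshold / max(threshold, |grad_j|).
    """
    return base_step * threshold / np.maximum(threshold, np.abs(grad))


def anisotropic_langevin_step(x, grad_fn, rng, threshold=1.0, base_step=0.01):
    """One unadjusted Langevin update with a diagonal, gradient-informed scale."""
    g = grad_fn(x)
    step = anisotropic_step_sizes(g, threshold, base_step)
    noise = rng.standard_normal(x.shape)
    # Drift follows the negative energy gradient; the noise variance in each
    # coordinate equals that coordinate's stepsize.
    return x - 0.5 * step * g + np.sqrt(step) * noise


def run_chain(x0, grad_fn, n_steps, seed=0, **kwargs):
    """Run the chain from x0 and return all visited states."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        x = anisotropic_langevin_step(x, grad_fn, rng, **kwargs)
        samples[k] = x
    return samples


samples = run_chain([3.0, 3.0], grad_energy, n_steps=2000)
```

In this sketch, coordinates with steep gradients take smaller steps and receive less noise, which keeps the update stable on the stiff direction without shrinking the step everywhere. In EBM training, `grad_energy` would be replaced by the gradient of the learned energy with respect to the negative samples.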