The problem of optimising functions with intractable gradients frequently arises in machine learning and statistics, ranging from maximum marginal likelihood estimation to the fine-tuning of generative models. Stochastic approximation methods for this class of problems typically require inner sampling loops to obtain (biased) stochastic gradient estimates, which rapidly become computationally expensive. In this work, we develop sequential Monte Carlo (SMC) samplers for the optimisation of functions with intractable gradients. Our approach replaces expensive inner sampling methods with efficient SMC approximations, which can yield significant computational gains. We establish convergence results for the idealised recursions defined by our methodology, which the SMC samplers approximate. We demonstrate the effectiveness of our approach on the reward-tuning of energy-based models in various settings.