Latent-variable energy-based models (LVEBMs) assign a single normalized energy to joint configurations of observed data and latent variables, offering expressive generative modeling while capturing hidden structure. We recast maximum-likelihood training as a saddle-point problem over distributions on the latent and joint manifolds and view the inner updates as coupled Wasserstein gradient flows. The resulting algorithm alternates overdamped Langevin updates for a joint negative pool and for conditional latent particles with stochastic parameter ascent, requiring no discriminator or auxiliary networks. Under standard smoothness and dissipativity assumptions, we prove existence of solutions to the coupled flows and their convergence, with decay rates in KL divergence and Wasserstein-2 distance. The saddle-point view further yields an evidence lower bound (ELBO) strictly tighter than bounds obtained with restricted amortized posteriors. We evaluate the method on numerical approximations of physical systems, where it performs competitively with comparable approaches.
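To make the saddle formulation concrete, here is a brief worked sketch under our own notation; the symbols $E_\theta$, $\mu_x$, $\nu$, $\mathcal{H}$, and the Gibbs variational step are illustrative assumptions, not taken from the abstract. Writing the model as $p_\theta(x,z) \propto e^{-E_\theta(x,z)}$, the Gibbs variational principle expresses both the conditional free energy and the log-partition function as maximizations over distributions,
\[
\log \int e^{-E_\theta(x,z)}\,dz = \max_{\mu_x} \Big\{ -\mathbb{E}_{z\sim\mu_x}[E_\theta(x,z)] + \mathcal{H}(\mu_x) \Big\},
\qquad
\log Z_\theta = \max_{\nu} \Big\{ -\mathbb{E}_{(x,z)\sim\nu}[E_\theta(x,z)] + \mathcal{H}(\nu) \Big\},
\]
so that averaged maximum likelihood becomes a saddle-point problem,
\[
\max_{\theta,\,\{\mu_x\}}\ \min_{\nu}\;
\mathbb{E}_{x\sim p_{\mathrm{data}}}\!\Big[ -\mathbb{E}_{z\sim\mu_x}[E_\theta(x,z)] + \mathcal{H}(\mu_x) \Big]
+ \mathbb{E}_{(x,z)\sim\nu}[E_\theta(x,z)] - \mathcal{H}(\nu).
\]
The Wasserstein gradient flows of the two inner objectives are simulated by overdamped Langevin dynamics,
\[
dz_t = -\nabla_z E_\theta(x, z_t)\,dt + \sqrt{2}\,dW_t \quad \text{(conditional latent particles)},
\qquad
d(x_t,z_t) = -\nabla_{x,z} E_\theta(x_t, z_t)\,dt + \sqrt{2}\,dB_t \quad \text{(joint negative pool)},
\]
while stochastic ascent on $\theta$ follows the contrastive gradient $\mathbb{E}_{\nu}[\nabla_\theta E_\theta] - \mathbb{E}_{p_{\mathrm{data}},\,\mu_x}[\nabla_\theta E_\theta]$.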
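For concreteness, the following is a minimal PyTorch sketch of the alternating scheme described above. The energy architecture, helper names (`langevin_z`, `langevin_joint`), step sizes, and pool size are all our own illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of the alternating scheme: coupled Langevin flows plus
# stochastic parameter ascent. All hyperparameters are illustrative.
import torch

class JointEnergy(torch.nn.Module):
    """Hypothetical joint energy E_theta(x, z) parameterized as a small MLP."""
    def __init__(self, x_dim, z_dim, hidden=128):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(x_dim + z_dim, hidden), torch.nn.SiLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.SiLU(),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=-1)).squeeze(-1)

def langevin_z(E, x, z, step):
    """Overdamped Langevin update of latent particles z with x held fixed
    (Euler-Maruyama discretization of the conditional flow)."""
    z = z.detach().requires_grad_(True)
    (g,) = torch.autograd.grad(E(x, z).sum(), z)
    return (z - step * g + (2 * step) ** 0.5 * torch.randn_like(z)).detach()

def langevin_joint(E, x, z, step):
    """Overdamped Langevin update of the joint negative pool (x, z)."""
    x = x.detach().requires_grad_(True)
    z = z.detach().requires_grad_(True)
    gx, gz = torch.autograd.grad(E(x, z).sum(), (x, z))
    x = (x - step * gx + (2 * step) ** 0.5 * torch.randn_like(x)).detach()
    z = (z - step * gz + (2 * step) ** 0.5 * torch.randn_like(z)).detach()
    return x, z

def train(E, loader, x_dim, z_dim, k_inner=20, step=1e-2, lr=1e-4, pool=512):
    opt = torch.optim.Adam(E.parameters(), lr=lr)
    # Persistent joint negative pool, initialized from noise.
    x_neg = torch.randn(pool, x_dim)
    z_neg = torch.randn(pool, z_dim)
    for x_pos in loader:
        z_pos = torch.randn(x_pos.size(0), z_dim)  # conditional latent particles
        for _ in range(k_inner):  # inner coupled Langevin flows
            z_pos = langevin_z(E, x_pos, z_pos, step)
            x_neg, z_neg = langevin_joint(E, x_neg, z_neg, step)
        # Stochastic parameter ascent via the contrastive gradient:
        # push energy down on (data, latent) pairs, up on the negative pool.
        loss = E(x_pos, z_pos).mean() - E(x_neg, z_neg).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Note that no discriminator or amortized inference network appears: the only learned object is the energy, and both sampling loops reuse its gradients, which is consistent with the abstract's claim that no auxiliary networks are required.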