Transport-based density estimation methods are receiving growing interest because of their ability to efficiently generate samples from the approximated density. We further investigate the sequential transport maps framework proposed in arXiv:2106.04170 and arXiv:2303.02554, which builds on a sequence of composed Knothe-Rosenblatt (KR) maps. Each of these maps is built by first estimating an intermediate density of moderate complexity, and then computing the exact KR map from a reference density to the precomputed approximate density. In our work, we explore the use of Sum-of-Squares (SoS) densities and $\alpha$-divergences for approximating the intermediate densities. Combining SoS densities with $\alpha$-divergences yields convex optimization problems that can be solved efficiently using semidefinite programming. The main advantage of $\alpha$-divergences is that they allow working with unnormalized densities, which provides benefits both numerically and theoretically. In particular, we provide two new convergence analyses of the sequential transport maps: one based on a triangle-like inequality, and the other on information-geometric properties of $\alpha$-divergences for unnormalized densities. The choice of intermediate densities is also crucial for the efficiency of the method. While tempered (or annealed) densities are the state of the art, we introduce diffusion-based intermediate densities, which make it possible to approximate densities known from samples only. Such intermediate densities are well established in machine learning for generative modeling. Finally, we propose and test different low-dimensional maps (or lazy maps) for dealing with high-dimensional problems, and we numerically demonstrate our methods on several benchmarks, including Bayesian inference problems and an unsupervised learning task.
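To make the remark about unnormalized densities concrete, the following is a minimal sketch of an $\alpha$-divergence evaluated on two positive functions that do not integrate to one. It assumes the common convention $D_\alpha(f\|g) = \frac{1}{\alpha(1-\alpha)} \int \left[\alpha f + (1-\alpha) g - f^\alpha g^{1-\alpha}\right] dx$, which may differ from the normalization used in the paper by a constant or a reparametrization of $\alpha$. The function names `alpha_divergence`, `target`, and `approx`, the integration interval, and the grid size are all hypothetical choices made for illustration; the integral is approximated with a plain trapezoid rule in one dimension.

```python
import math


def alpha_divergence(f, g, alpha, a=-10.0, b=10.0, n=4001):
    """Trapezoid-rule estimate of D_alpha(f || g) on [a, b].

    f and g are positive, possibly unnormalized, 1D densities.
    Assumed convention (not taken from the paper):
        D_alpha = 1/(alpha*(1-alpha)) * int[alpha*f + (1-alpha)*g - f^alpha * g^(1-alpha)]
    """
    if alpha in (0.0, 1.0):
        raise ValueError("alpha = 0 and alpha = 1 are limiting cases, not handled here")
    h = (b - a) / (n - 1)
    total = 0.0
    for i in range(n):
        x = a + i * h
        fx, gx = f(x), g(x)
        integrand = alpha * fx + (1.0 - alpha) * gx - fx**alpha * gx ** (1.0 - alpha)
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * integrand
    return total * h / (alpha * (1.0 - alpha))


def target(x):
    # Unnormalized Gaussian bump: its total mass is 3*sqrt(2*pi), not 1.
    return 3.0 * math.exp(-0.5 * x * x)


def approx(x):
    # A shifted, wider bump with a different (also unnormalized) mass.
    return 2.5 * math.exp(-0.5 * (x - 0.3) ** 2 / 1.2)


if __name__ == "__main__":
    for alpha in (0.5, 2.0):
        print(alpha, alpha_divergence(target, approx, alpha))
```

Under this convention the extra terms $\alpha f + (1-\alpha) g$ are what keep the divergence nonnegative, and zero only when $f = g$, without requiring either function to be normalized. For $\alpha = 2$ the expression reduces to $\frac{1}{2}\int (f-g)^2 / g \, dx$, and for $\alpha = 1/2$ to $2\int (\sqrt{f} - \sqrt{g})^2 \, dx$.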