Sequential transport maps using SoS density estimation and $\alpha$-divergences

Transport-based density estimation methods are receiving growing interest because of their ability to efficiently generate samples from the approximated density. We further investigate the sequential transport maps framework proposed in arXiv:2106.04170 and arXiv:2303.02554, which builds on a sequence of composed Knothe-Rosenblatt (KR) maps. Each of these maps is built by first estimating an intermediate density of moderate complexity, and then computing the exact KR map from a reference density to the precomputed approximate density. In our work, we explore the use of Sum-of-Squares (SoS) densities and $\alpha$-divergences for approximating the intermediate densities. Combining SoS densities with $\alpha$-divergences yields convex optimization problems that can be solved efficiently using semidefinite programming. The main advantage of $\alpha$-divergences is that they make it possible to work with unnormalized densities, which provides benefits both numerically and theoretically. In particular, we provide a new convergence analysis of the sequential transport maps based on information-geometric properties of $\alpha$-divergences. The choice of intermediate densities is also crucial for the efficiency of the method. While tempered (or annealed) densities are the state of the art, we introduce diffusion-based intermediate densities, which make it possible to approximate densities known from samples only. Such intermediate densities are well established in machine learning for generative modeling. Finally, we propose low-dimensional maps (or lazy maps) for dealing with high-dimensional problems, and we numerically demonstrate our methods on Bayesian inference problems and unsupervised learning tasks.
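To make the "exact KR map to an SoS density" step concrete, here is a minimal one-dimensional sketch. In 1D with a uniform reference on $[0,1]$, the KR map reduces to the inverse CDF of the target. Everything specific below is an illustrative assumption and not the paper's implementation: the monomial basis $(1, x, x^2)$, the randomly drawn positive definite matrix `A`, the helper names `sos_density` and `kr_map_1d`, and the grid-based CDF inversion. In particular, the matrix is not fitted by semidefinite programming here; the sketch only shows how a map is obtained once an SoS density is given.

```python
import numpy as np

def sos_density(x, A):
    """Unnormalized SoS density g(x) = phi(x)^T A phi(x) with phi(x) = (1, x, x^2).

    A is assumed symmetric positive semidefinite, so g(x) >= 0 everywhere.
    """
    phi = np.stack([np.ones_like(x), x, x**2], axis=-1)  # shape (n, 3)
    return np.einsum("ni,ij,nj->n", phi, A, phi)

def kr_map_1d(A, n_grid=2001):
    """Return T with T(u) = F^{-1}(u), where F is the CDF of g on [0, 1].

    The CDF is computed with the trapezoidal rule on a grid and inverted by
    linear interpolation (a numerical approximation of the exact map).
    """
    grid = np.linspace(0.0, 1.0, n_grid)
    g = sos_density(grid, A)
    increments = 0.5 * (g[1:] + g[:-1]) * np.diff(grid)
    cdf = np.concatenate([[0.0], np.cumsum(increments)])
    cdf /= cdf[-1]  # normalization happens here; g itself is unnormalized

    def T(u):
        return np.interp(u, cdf, grid)

    return T, grid, g

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
A = B @ B.T + 1e-3 * np.eye(3)  # positive definite, hence g > 0 on the grid

T, grid, g = kr_map_1d(A)
u = rng.uniform(size=20000)     # samples from the uniform reference
samples = T(u)                  # approximate samples from g / integral(g)
```

Because the normalizing constant only enters through `cdf /= cdf[-1]`, the density can stay unnormalized until the map is built. In higher dimensions a KR map applies the same construction to each conditional marginal in turn, which this sketch does not cover.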
