Identifying low-dimensional structure in high-dimensional probability measures is an essential pre-processing step for efficient sampling. We introduce a method for identifying and approximating a target measure $\pi$ as a perturbation of a given reference measure $\mu$ along a few significant directions of $\mathbb{R}^{d}$. The reference measure can be a Gaussian or a nonlinear transformation of a Gaussian, as commonly arising in generative modeling. Our method extends prior work on minimizing majorizations of the Kullback--Leibler divergence to identify optimal approximations within this class of measures. Our main contribution unveils a connection between the \emph{dimensional} logarithmic Sobolev inequality (LSI) and approximations with this ansatz. Specifically, when the target and reference are both Gaussian, we show that minimizing the bound given by the dimensional LSI is equivalent to minimizing the KL divergence restricted to this ansatz. For general non-Gaussian measures, the dimensional LSI produces majorants that uniformly improve on previous majorants for gradient-based dimension reduction. We further demonstrate the applicability of this analysis to the squared Hellinger distance, where analogous reasoning shows that the dimensional Poincar\'e inequality offers improved bounds.
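The Gaussian-to-Gaussian setting described above admits a concrete toy illustration. The sketch below is not the paper's method — it is a minimal, hypothetical example assuming a zero-mean identity-covariance reference and a target whose covariance differs from the identity along only a few directions. It compares the KL divergence from the target to the raw reference against the KL to a rank-$r$ approximation that retains only the dominant perturbation directions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 20, 3  # ambient dimension and number of significant directions

def kl_gauss(S1, S2):
    """KL( N(0, S1) || N(0, S2) ) between zero-mean Gaussians."""
    dim = S1.shape[0]
    S2inv = np.linalg.inv(S2)
    return 0.5 * (np.trace(S2inv @ S1) - dim
                  + np.linalg.slogdet(S2)[1] - np.linalg.slogdet(S1)[1])

# Toy target: identity reference plus an exactly rank-r perturbation,
# i.e. the target differs from the reference along only r directions.
V = rng.standard_normal((d, r))
Sigma = np.eye(d) + V @ V.T

# Low-dimensional ansatz: keep the top-r eigen-directions of Sigma - I.
evals, evecs = np.linalg.eigh(Sigma - np.eye(d))
idx = np.argsort(np.abs(evals))[::-1][:r]
Sigma_r = np.eye(d) + (evecs[:, idx] * evals[idx]) @ evecs[:, idx].T

kl_to_reference = kl_gauss(Sigma, np.eye(d))   # large: reference alone is poor
kl_to_ansatz = kl_gauss(Sigma, Sigma_r)        # ~0: r directions suffice here
print(kl_to_reference, kl_to_ansatz)
```

Because the perturbation is exactly rank $r$ in this toy example, the rank-$r$ ansatz recovers the target up to numerical error; for a generic target, the interesting question — addressed by the majorants discussed above — is how to choose the directions when no exact low-rank structure exists.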