Continuous normalizing flows are widely used in generative tasks, where a flow network transports a data distribution $P$ to a normal distribution. A flow model that can transport $P$ to an arbitrary $Q$, where both $P$ and $Q$ are accessible only via finite samples, would be of interest in many applications, particularly in the recently developed telescoping density ratio estimation (DRE), which calls for the construction of intermediate densities to bridge between $P$ and $Q$. In this work, we propose such a ``Q-malizing flow'' as a neural-ODE model that is trained from empirical samples to transport invertibly from $P$ to $Q$ (and vice versa) and is regularized by minimizing the transport cost. The trained flow model allows us to perform infinitesimal DRE along the time-parametrized $\log$-density by training an additional continuous-time flow network with a classification loss. This time-score network estimates the time-partial derivative of the $\log$-density. Integrating the time-score network over time provides a telescoping DRE between $P$ and $Q$ that is more stable than a one-step DRE. The effectiveness of the proposed model is demonstrated empirically on mutual information estimation from high-dimensional data and on energy-based generative models of image data.
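The telescoping step can be illustrated on a toy case where everything is known in closed form. The sketch below is not the proposed method: it assumes a hand-picked Gaussian path $p_t = N(t\mu, 1)$ between $P = N(0,1)$ and $Q = N(\mu,1)$, replaces the learned time-score network with the analytic derivative $\partial_t \log p_t(x) = (x - t\mu)\mu$, and uses hypothetical function names. It only checks that integrating the time score over $t \in [0,1]$ recovers $\log q(x) - \log p(x)$.

```python
import math


def log_normal_pdf(x, mean, std=1.0):
    """Log-density of a 1D Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def time_score(x, t, mu):
    """Analytic d/dt log p_t(x) for the assumed path p_t = N(t * mu, 1).

    In the setting of the abstract this quantity would be estimated by a
    trained network; the closed form is used here only for illustration.
    """
    return (x - t * mu) * mu


def telescoped_log_ratio(x, mu, n_steps=100):
    """Trapezoid-rule integral of the time score over t in [0, 1]."""
    h = 1.0 / n_steps
    total = 0.5 * (time_score(x, 0.0, mu) + time_score(x, 1.0, mu))
    for k in range(1, n_steps):
        total += time_score(x, k * h, mu)
    return total * h


mu = 3.0   # mean of Q; P is the standard normal (assumed toy setup)
x = 0.7    # point at which the log density ratio is evaluated

estimate = telescoped_log_ratio(x, mu)
exact = log_normal_pdf(x, mu) - log_normal_pdf(x, 0.0)
print(estimate, exact)
```

Each small time step contributes a modest increment to the log ratio, which is the intuition behind preferring the integrated estimate to a single classifier trained directly on well-separated $P$ and $Q$.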