Normalizing flows are a flexible class of probability distributions, expressed as transformations of a simple base distribution. A limitation of standard normalizing flows is representing distributions with heavy tails, which arise in applications to both density estimation and variational inference. A popular current solution is to use a heavy-tailed base distribution; examples include the tail adaptive flow (TAF) methods of Laszkiewicz et al. (2022). We argue this can lead to poor performance due to the difficulty of optimising neural networks, such as normalizing flows, under heavy-tailed input, and we demonstrate this problem empirically. We propose an alternative: use a Gaussian base distribution and a final transformation layer which can produce heavy tails. We call this approach tail transform flow (TTF). Experimental results show it outperforms current methods, especially when the target distribution has large dimension or tail weight.
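As a minimal sketch of the idea behind a heavy-tail-producing final layer (not the paper's exact TTF parameterisation), one monotone, invertible, elementwise map from Gaussian-distributed values to heavy-tailed values composes the standard normal CDF with a Student-t inverse CDF; the degrees-of-freedom parameter `df` here is an illustrative stand-in for a learnable tail-weight parameter:

```python
import numpy as np
from scipy.stats import norm, t

def tail_transform(z, df=2.0):
    """Map standard-normal-distributed values to Student-t(df) values.

    Elementwise, strictly increasing, and invertible, so in principle it
    could serve as a final flow layer; smaller df gives heavier tails.
    Hypothetical sketch -- not the exact TTF layer from the paper.
    """
    u = norm.cdf(z)        # probability integral transform: N(0,1) -> U(0,1)
    return t.ppf(u, df)    # Student-t inverse CDF: U(0,1) -> heavy tails

# Gaussian samples rarely stray far from 0; the transformed samples can.
rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)
x = tail_transform(z, df=2.0)
print(np.abs(z).max(), np.abs(x).max())
```

A full flow layer would also need the log-Jacobian of this map (the ratio of the two densities), which is available in closed form for this choice of CDFs.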