Despite the successes of probabilistic models based on passing noise through neural networks, recent work has shown that such methods often fail to capture tail behavior accurately unless the tails of the base distribution are appropriately calibrated. To overcome this deficiency, we propose a systematic approach for analyzing the tails of random variables, and we illustrate how this approach can be used during the static analysis pass of a probabilistic programming language compiler, that is, before any samples are drawn. To characterize how the tails change under various operations, we develop an algebra that acts on a three-parameter family of tail asymptotics based on the generalized Gamma distribution. The algebra is closed under addition and multiplication; it can distinguish sub-Gaussians with differing scales; and it handles ratios well enough to reproduce the tails of most important statistical distributions directly from their definitions. Our empirical results confirm that inference algorithms that leverage our heavy-tailed algebra attain superior performance across a number of density modeling and variational inference tasks.
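To make the three-parameter family concrete, the worked example below sketches one way such tails might be written down. The symbols $\nu$, $\sigma$, $\rho$, $c$ and the scales $s_1$, $s_2$ are assumptions chosen for illustration; the abstract does not fix a notation, and the rules shown are elementary special cases rather than the full algebra.

```latex
% Generalized Gamma density in its standard form (a, d, p > 0):
\[
  f(x; a, d, p) = \frac{p}{a^{d}\,\Gamma(d/p)}\; x^{d-1}
                  \exp\!\bigl(-(x/a)^{p}\bigr), \qquad x > 0 .
\]
% Its tail has the three-parameter shape (assumed notation):
\[
  f(x) = c\, x^{\nu} e^{-\sigma x^{\rho}}, \qquad
  \nu = d - 1, \quad \sigma = a^{-p}, \quad \rho = p .
\]
% Sub-Gaussians with different scales get different sigma:
\[
  X_i \sim \mathcal{N}(0, s_i^{2})
  \;\Longrightarrow\;
  (\nu, \sigma_i, \rho) = \Bigl(0, \tfrac{1}{2 s_i^{2}}, 2\Bigr).
\]
% Addition of independent Gaussians stays in the family:
\[
  X_1 + X_2 \sim \mathcal{N}(0, s_1^{2} + s_2^{2})
  \;\Longrightarrow\;
  \frac{1}{\sigma} = \frac{1}{\sigma_1} + \frac{1}{\sigma_2} .
\]
% A ratio of independent standard normals is standard Cauchy,
% so the tail becomes a power law:
\[
  f_{X_1 / X_2}(x) = \frac{1}{\pi (1 + x^{2})}
  \sim \frac{1}{\pi}\, x^{-2} \qquad (x \to \infty).
\]
```

The last line illustrates why ratios matter: two light-tailed inputs can produce a heavy-tailed output, which is the kind of change a static tail analysis would need to track.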