Exponential families are statistical models that serve as workhorses in statistics, information theory, and machine learning, among other fields. An exponential family can be normalized either subtractively, by its cumulant (or free energy) function, or equivalently divisively, by its partition function. Both the subtractive and divisive normalizers are strictly convex and smooth functions that induce pairs of Bregman and Jensen divergences. It is well known that skewed Bhattacharyya distances between probability densities of an exponential family amount to skewed Jensen divergences induced by the cumulant function between their corresponding natural parameters, and that, in limit cases, the sided Kullback-Leibler divergences amount to reverse-sided Bregman divergences. In this paper, we first show that the $\alpha$-divergences between unnormalized densities of an exponential family amount to scaled $\alpha$-skewed Jensen divergences induced by the partition function. We then show how comparative convexity with respect to a pair of quasi-arithmetic means allows one to deform both convex functions and their arguments, and thereby define dually flat spaces with corresponding divergences when ordinary convexity is preserved.
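The two identities stated above can be checked numerically on a small example. The sketch below is illustrative only and rests on several assumptions that are not taken from the paper. It uses the Poisson family, written with natural parameter $\theta = \log\lambda$, cumulant function $F(\theta) = e^{\theta}$ and partition function $Z(\theta) = \exp(F(\theta))$. It takes the $\alpha$-divergence between positive densities to be $\frac{1}{\alpha(1-\alpha)} \sum_x \left(\alpha p + (1-\alpha) q - p^{\alpha} q^{1-\alpha}\right)$, a common convention that may differ from the paper's parameterization. All function names are hypothetical.

```python
import math

# Assumption: Poisson family with natural parameter theta = log(lambda),
# sufficient statistic t(x) = x and base measure 1/x!.

def F(theta):
    """Cumulant (subtractive normalizer) of the Poisson family."""
    return math.exp(theta)

def Z(theta):
    """Partition function (divisive normalizer), Z = exp(F)."""
    return math.exp(F(theta))

def unnormalized(x, theta):
    """Unnormalized density exp(theta * x) / x!."""
    return math.exp(theta * x - math.lgamma(x + 1))

def normalized(x, theta):
    """Probability mass exp(theta * x - F(theta)) / x!."""
    return math.exp(theta * x - F(theta) - math.lgamma(x + 1))

def skewed_jensen(G, a, t1, t2):
    """a-skewed Jensen divergence induced by a convex function G."""
    return a * G(t1) + (1 - a) * G(t2) - G(a * t1 + (1 - a) * t2)

def skewed_bhattacharyya(a, t1, t2, n_terms=200):
    """-log sum_x p^a q^(1-a), with the infinite sum truncated at n_terms."""
    coeff = sum(
        normalized(x, t1) ** a * normalized(x, t2) ** (1 - a)
        for x in range(n_terms)
    )
    return -math.log(coeff)

def alpha_divergence_unnormalized(a, t1, t2, n_terms=200):
    """Assumed alpha-divergence between the two unnormalized densities."""
    total = 0.0
    for x in range(n_terms):
        p, q = unnormalized(x, t1), unnormalized(x, t2)
        total += a * p + (1 - a) * q - p ** a * q ** (1 - a)
    return total / (a * (1 - a))

a = 0.3
theta1, theta2 = math.log(2.0), math.log(5.0)

# Skewed Bhattacharyya distance vs. skewed Jensen divergence of F.
print(skewed_bhattacharyya(a, theta1, theta2), skewed_jensen(F, a, theta1, theta2))

# Alpha-divergence of unnormalized densities vs. scaled Jensen divergence of Z.
print(
    alpha_divergence_unnormalized(a, theta1, theta2),
    skewed_jensen(Z, a, theta1, theta2) / (a * (1 - a)),
)
```

Under these assumptions, the two numbers on each printed line should agree up to truncation and floating-point error. The sums are truncated at 200 terms, which is ample for the small rates used here but would need to grow for larger $\lambda$.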