We introduce two new classes of measures of information for statistical experiments, which generalise and subsume $\phi$-divergences, integral probability metrics, $\mathfrak{N}$-distances (MMD), and $(f,\Gamma)$-divergences between two or more distributions. This enables us to derive a simple geometrical relationship between measures of information and the Bayes risk of a statistical decision problem, thus extending the variational $\phi$-divergence representation to multiple distributions in an entirely symmetric manner. The new families of divergences are closed under the action of Markov operators, which yields an information processing equality that refines and generalises the classical data processing inequality. This equality gives insight into the significance of the choice of the hypothesis class in classical risk minimization.
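
For orientation, the classical special case that these results generalise can be checked numerically. The sketch below is not the construction introduced here: it assumes two distributions on a finite outcome space, 0-1 loss, a uniform prior, and total variation as the divergence. Under those assumptions the Bayes risk equals $\tfrac{1}{2}(1 - \mathrm{TV}(P,Q))$, and pushing both distributions through a Markov kernel can only shrink the total variation (the classical data processing inequality). All names (`total_variation`, `bayes_risk`, `push_forward`, `P`, `Q`, `K`) and the numbers are hypothetical, chosen for illustration.

```python
def total_variation(p, q):
    """Total variation distance between two finite distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def bayes_risk(p, q, prior=0.5):
    """Minimal 0-1 loss for deciding between P (with probability `prior`) and Q."""
    return sum(min(prior * a, (1.0 - prior) * b) for a, b in zip(p, q))


def push_forward(p, kernel):
    """Apply a Markov kernel; kernel[i][j] is the probability of moving i -> j."""
    n_out = len(kernel[0])
    return [sum(p[i] * kernel[i][j] for i in range(len(p))) for j in range(n_out)]


# Illustrative (made-up) distributions on three outcomes.
P = [0.7, 0.2, 0.1]
Q = [0.1, 0.3, 0.6]

# Illustrative row-stochastic kernel mapping three outcomes to two.
K = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.2, 0.8],
]

PK = push_forward(P, K)
QK = push_forward(Q, K)

# Bayes risk and total variation before and after the kernel.
print(bayes_risk(P, Q), 0.5 * (1.0 - total_variation(P, Q)))
print(bayes_risk(PK, QK), 0.5 * (1.0 - total_variation(PK, QK)))

# Classical data processing inequality: processing cannot increase TV.
print(total_variation(PK, QK) <= total_variation(P, Q))
```

The example only exhibits an inequality between the two total variation values; the information processing equality described above is a sharper statement and is not reproduced by this sketch.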