Shannon defined the mutual information between two variables. We illustrate why the true mutual information between a variable and the predictions made by a prediction algorithm is not a suitable measure of prediction quality, whereas the apparent Shannon mutual information (ASI) is; indeed, it is the unique prediction quality measure satisfying either of two very different lists of desirable properties, as previously shown by de Finetti and other authors. However, estimating the uncertainty of the ASI is difficult, because the distribution of the individual values of $j(x,y)=\log\frac{Q_y(x)}{P(x)}$ has long, asymmetric, heavy tails. We propose a Bayesian method for modelling the distribution of $j(x,y)$; the uncertainty in the ASI can then be inferred from the posterior distribution. The method is based on Dirichlet-based mixtures of skew-Student distributions. We illustrate its use on data from a Bayesian model for predicting the recurrence time of prostate cancer. We believe that this approach is generally appropriate for most problems in which it is infeasible to derive the explicit distribution of the samples of $j(x,y)$, though the precise modelling parameters may need adjustment to suit particular cases.
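As a rough illustration of the quantity $j(x,y)$, the sketch below computes individual values of $\log\frac{Q_y(x)}{P(x)}$ for a handful of made-up cases and averages them. It assumes, beyond what the abstract states, that $Q_y(x)$ is the probability the predictor assigns to the realised outcome $x$ given data $y$, that $P(x)$ is the corresponding prior probability, and that the ASI is estimated by the sample mean of $j$. The function names and the numbers are hypothetical, and this is not the Bayesian mixture method proposed here.

```python
import math


def j_value(q_y_x: float, p_x: float) -> float:
    """Return log(Q_y(x) / P(x)) in nats for one realised outcome.

    q_y_x: predictive probability given to the outcome (assumed meaning).
    p_x:   prior probability of the same outcome (assumed meaning).
    """
    return math.log(q_y_x / p_x)


def mean_j(pairs):
    """Sample mean of j over (Q_y(x), P(x)) pairs.

    Treating this mean as an ASI estimate is an assumption of this sketch.
    """
    values = [j_value(q, p) for q, p in pairs]
    return sum(values) / len(values)


# Hypothetical (Q_y(x), P(x)) pairs; the third is a confident wrong prediction.
pairs = [(0.8, 0.5), (0.6, 0.5), (0.1, 0.5), (0.9, 0.5)]

values = [j_value(q, p) for q, p in pairs]
estimate = mean_j(pairs)

print("individual j values:", values)
print("sample mean of j:", estimate)
```

In this toy data a single badly wrong prediction contributes a large negative value and pulls the mean below zero, which hints at why the tails of the distribution of $j(x,y)$ matter when assessing uncertainty.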