Shannon defined the mutual information between two variables. We illustrate why the true mutual information between a variable and the predictions made by a prediction algorithm is not a suitable measure of prediction quality, whereas the apparent Shannon mutual information (ASI) is; indeed, it is the unique prediction quality measure possessing either of two very different lists of desirable properties, as previously shown by de Finetti and other authors. However, estimating the uncertainty of the ASI is difficult, because the distribution of the individual values of $j(x,y)=\log\frac{Q_y(x)}{P(x)}$ has long, asymmetric heavy tails. We propose a Bayesian method for modelling the distribution of $j(x,y)$, and the uncertainty in the ASI can be inferred from the resulting posterior distribution. The method is based on Dirichlet-based mixtures of skew-Student distributions. We illustrate its use on data from a Bayesian model for predicting the recurrence time of prostate cancer. We believe that this approach is appropriate for most problems in which it is infeasible to derive the explicit distribution of the samples of $j(x,y)$, though the precise modelling parameters may need adjustment to suit particular cases.
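To make the quantity $j(x,y)$ concrete, the following is a minimal sketch of how its per-sample values might be computed for a discrete outcome. It assumes that the ASI is estimated as the sample mean of $j(x,y)$ over test cases, which the abstract does not state explicitly, and it uses natural logarithms. The function name `apparent_information`, the outcome labels, and all probabilities are hypothetical illustrations, not data from the paper. The sketch does not implement the proposed Bayesian mixture model; it only produces the values whose distribution that model would describe.

```python
import math


def apparent_information(samples, prior):
    """Return the per-sample values j = log(Q_y(x) / P(x)) and their mean.

    samples: list of (x, q) pairs, where x is the observed outcome and q is a
             dict giving the predicted probability Q_y(.) for each outcome.
    prior:   dict giving the baseline probability P(.) for each outcome.
    """
    js = [math.log(q[x] / prior[x]) for x, q in samples]
    return js, sum(js) / len(js)


# Hypothetical baseline distribution P(x) over two outcomes.
prior = {"recur": 0.5, "no_recur": 0.5}

# Hypothetical test cases: (observed outcome, predicted distribution Q_y).
samples = [
    ("recur", {"recur": 0.8, "no_recur": 0.2}),
    ("no_recur", {"recur": 0.25, "no_recur": 0.75}),
    ("recur", {"recur": 0.1, "no_recur": 0.9}),  # confident but wrong
]

js, asi_estimate = apparent_information(samples, prior)
print("per-sample j values:", js)
print("sample mean (assumed ASI estimate):", asi_estimate)
```

Each value of $j(x,y)$ is bounded above by $\log\frac{1}{P(x)}$ but is unbounded below, so a single confident but wrong prediction, such as the third case here, can dominate the sample mean. This is one way to see why the tails of the distribution are asymmetric.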