Model averaging is a robust method for handling model uncertainty in statistical analysis. It is often useful to perform data subset selection at the same time, using model selection criteria to compare models across different subsets of the data. Two different criteria have been proposed in the literature for weighting the data subsets. We compare the two criteria closely in a unified treatment based on the Kullback-Leibler divergence and conclude that one of them is subtly flawed and tends to yield larger uncertainties due to loss of information. Analytical and numerical examples are provided.