Model averaging is a useful and robust method for dealing with model uncertainty in statistical analysis. It is often useful to carry out data subset selection at the same time, using model selection criteria to compare models across different subsets of the data. Two different criteria have been proposed in the literature for how the data subsets should be weighted. We compare the two criteria closely in a unified treatment based on the Kullback-Leibler divergence, and conclude that one of them is subtly flawed and will tend to yield larger uncertainties due to loss of information. Analytical and numerical examples are provided.