Decision trees learned from data remain widely used for nonparametric prediction. When uncertainty plays a prominent role in analysis and decision-making, predicting a full probability distribution is preferable to making a point prediction. We study how to modify a tree so that it produces nonparametric predictive distributions. We find that the standard method for building trees may not yield good predictive distributions, and we propose replacing the usual splitting criteria with ones based on proper scoring rules. Analyses of simulated data and several real datasets demonstrate that trees built with the new splitting criteria have better predictive properties when the entire predictive distribution is taken into account.
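To make the idea of a scoring-rule-based split concrete, the following is a minimal sketch and not the procedure from the paper. The abstract does not name a specific scoring rule, so this example assumes the continuous ranked probability score (CRPS), assumes each leaf predicts the empirical distribution of its training responses, and uses hypothetical helper names (`node_crps`, `best_split`). Under those assumptions, the total in-sample CRPS of a node with n responses reduces to n/2 times the mean absolute difference over all pairs of responses, and a split is chosen to minimize the summed CRPS of the two children.

```python
import numpy as np


def node_crps(y):
    """Total in-sample CRPS of a node (assumed criterion, not from the paper).

    Each response y_i is scored against the empirical distribution of all
    responses in the node. Summed over the node, this equals
    n * 0.5 * mean_{j,k} |y_j - y_k|.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    if n == 0:
        return 0.0
    # Mean absolute difference over all ordered pairs (including j == k).
    mean_abs_diff = np.abs(y[:, None] - y[None, :]).mean()
    return n * 0.5 * mean_abs_diff


def best_split(x, y, min_leaf=1):
    """Search one numeric feature for the threshold minimizing child CRPS.

    Returns (threshold, score). The threshold is None when no split
    improves on the unsplit node.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")
    xs, ys = x[order], y[order]

    best_thr, best_score = None, node_crps(ys)
    for i in range(min_leaf, len(xs) - min_leaf + 1):
        if xs[i - 1] == xs[i]:
            continue  # cannot place a threshold between tied feature values
        score = node_crps(ys[:i]) + node_crps(ys[i:])
        if score < best_score:
            best_thr, best_score = (xs[i - 1] + xs[i]) / 2.0, score
    return best_thr, best_score


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6]
    y = [0, 0, 0, 10, 10, 10]
    print(best_split(x, y))
```

The pairwise computation is quadratic in the node size, which keeps the sketch short; a sorted-order formula would be the natural choice for larger nodes. Other proper scoring rules could be substituted by swapping out `node_crps`.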