We consider training decision trees on noisily labeled data, focusing on loss functions that can lead to robust learning algorithms. Our contributions are threefold. First, we offer novel theoretical insights into the robustness of many existing loss functions in the context of decision tree learning. We show that some of these losses belong to a class that we call conservative losses, and that conservative losses lead to early-stopping behavior during training and noise-tolerant predictions during testing. Second, we introduce a framework for constructing robust loss functions, called distribution losses. These losses apply percentile-based penalties derived from an assumed margin distribution, and they naturally adapt to different noise rates via a robustness parameter. In particular, we introduce a new loss, the negative exponential loss, which leads to an efficient greedy impurity-reduction learning algorithm. Third, our experiments on multiple datasets and noise settings validate our theoretical insights and the effectiveness of our adaptive negative exponential loss.