We study the training of decision trees on data with noisy labels, focusing on loss functions that yield robust learning algorithms. Our contributions are threefold. First, we offer novel theoretical insights into the robustness of many existing loss functions in the context of decision tree learning. We show that some of these losses belong to a class that we call conservative losses, and that conservative losses lead to early-stopping behavior during training and noise-tolerant predictions at test time. Second, we introduce a framework for constructing robust loss functions, called distribution losses. These losses apply percentile-based penalties derived from an assumed margin distribution, and they adapt naturally to different noise rates via a robustness parameter. In particular, we introduce a new loss, the negative exponential loss, which leads to an efficient greedy impurity-reduction learning algorithm. Finally, experiments on multiple datasets and noise settings validate our theoretical insights and the effectiveness of our adaptive negative exponential loss.