Many real-world applications provide a continuous stream of data that is subsequently used by machine learning models to solve regression tasks of interest. Hoeffding trees and their variants have a long-standing tradition due to their effectiveness, either alone or as base models in broader ensembles. At the same time, a recent line of work in batch learning has shown that kernel density estimation (KDE) is an effective approach for smoothing predictions in imbalanced regression tasks [Yang et al., 2021]. Moreover, another recent line of work in batch learning, called hierarchical shrinkage (HS) [Agarwal et al., 2022], has introduced a post-hoc regularization method for decision trees that does not alter the structure of the learned tree. Using a telescoping argument, we cast KDE into the streaming setting and extend the implementation of HS to incremental decision tree models. Armed with these extensions, we investigate the performance of incremental decision trees equipped with these options on datasets commonly used for online regression. We conclude that KDE is beneficial in the early parts of the stream, while HS rarely, if ever, offers performance benefits. Our code is publicly available at: https://github.com/marinaAlchirch/DSFA_2026.
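The telescoping step behind the streaming KDE can be illustrated with a minimal sketch: the density estimate is a running average of kernels, $f_n(x) = \frac{(n-1) f_{n-1}(x) + K_h(x - x_n)}{n}$, so each new sample updates the estimate in place. The class name, grid-based evaluation, and bandwidth choice below are illustrative assumptions, not the paper's implementation.

```python
import math


def gauss_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


class StreamingKDE:
    """Illustrative streaming KDE over a fixed evaluation grid.

    Per-sample cost is O(len(grid)); no past samples are stored.
    The update is the telescoping running-mean identity:
        f_n = f_{n-1} + (K_h(x - x_n) - f_{n-1}) / n
    which equals ((n-1) * f_{n-1} + K_h(x - x_n)) / n.
    """

    def __init__(self, grid, bandwidth):
        self.grid = grid            # points at which the density is tracked
        self.h = bandwidth          # kernel bandwidth (assumed fixed)
        self.f = [0.0] * len(grid)  # current density estimate on the grid
        self.n = 0                  # number of samples seen so far

    def update(self, x):
        """Fold one new stream observation into the estimate."""
        self.n += 1
        for i, g in enumerate(self.grid):
            k = gauss_kernel((g - x) / self.h) / self.h
            self.f[i] += (k - self.f[i]) / self.n
```

Because the estimate is maintained incrementally, it is available at any point in the stream, which is what makes the KDE-based smoothing applicable early on.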