Component-wise gradient boosting algorithms are popular for their intrinsic variable selection and implicit regularization, which can be especially beneficial for very flexible model classes. When generalized additive models for location, scale and shape (GAMLSS) are estimated by component-wise gradient boosting, an important part of the estimation procedure is determining the relative complexity of the submodels corresponding to the different distribution parameters. Existing methods either require computationally expensive tuning or can be biased by structural differences in the sizes of the negative gradients, which lead to imbalances between the submodels. To address this issue, shrunk optimal step lengths have been suggested as a replacement for the typical small fixed step lengths in the non-cyclical boosting algorithm, but so far only for a Gaussian response variable. In this article, we propose a new adaptive step length approach that accounts for the relative size of the fitted base-learners to ensure a natural balance between the different submodels. The new balanced boosting approach thus represents a computationally efficient and easily generalizable alternative to shrunk optimal step lengths. We implement the balanced non-cyclical boosting algorithm for Gaussian, negative binomial and Weibull distributed response variables and demonstrate the competitive performance of the new adaptive step length approach in a simulation study, in an analysis of count data on the number of doctor's visits, and in an analysis of survival data from an oncological trial.
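To make the setting concrete, the following is a minimal sketch of non-cyclical component-wise boosting for a Gaussian location-scale model with simple linear base-learners. The abstract does not state the balancing rule, so the step-length rule used here is an assumption chosen purely for illustration: both candidate updates are rescaled to the same Euclidean norm, namely `nu` times the average norm of the two fitted base-learners. The function names (`boost`, `best_linear_base_learner`), the simulated data, and the stop-when-the-loss-no-longer-improves rule are likewise hypothetical and are not taken from the article.

```python
import numpy as np

def loss(y, eta_mu, eta_sigma):
    """Gaussian negative log-likelihood (constants dropped), sigma = exp(eta_sigma)."""
    return float(np.sum(eta_sigma + 0.5 * (y - eta_mu) ** 2 * np.exp(-2.0 * eta_sigma)))

def negative_gradients(y, eta_mu, eta_sigma):
    """Negative gradients of the loss with respect to each additive predictor."""
    inv_var = np.exp(-2.0 * eta_sigma)
    return {"mu": (y - eta_mu) * inv_var,
            "sigma": (y - eta_mu) ** 2 * inv_var - 1.0}

def best_linear_base_learner(X, u):
    """Fit one least-squares slope per column; return the best column and its slope."""
    slopes = X.T @ u / np.sum(X ** 2, axis=0)
    rss = np.sum((u[:, None] - X * slopes) ** 2, axis=0)
    j = int(np.argmin(rss))
    return j, slopes[j]

def boost(X, y, n_iter=200, nu=0.1):
    n, p = X.shape
    coef = {"mu": np.zeros(p), "sigma": np.zeros(p)}
    offset = {"mu": y.mean(), "sigma": np.log(y.std())}
    eta = {k: np.full(n, offset[k]) for k in coef}
    history = [loss(y, eta["mu"], eta["sigma"])]
    for _ in range(n_iter):
        u = negative_gradients(y, eta["mu"], eta["sigma"])
        fits = {k: best_linear_base_learner(X, u[k]) for k in coef}
        # Size of each fitted base-learner, measured by its Euclidean norm.
        sizes = {k: np.linalg.norm(X[:, j] * b) for k, (j, b) in fits.items()}
        # ASSUMPTION (illustrative only): give both candidate updates the same norm.
        target = nu * np.mean(list(sizes.values()))
        candidates = {}
        for k, (j, b) in fits.items():
            step = target / sizes[k] if sizes[k] > 0 else 0.0
            trial = dict(eta)
            trial[k] = eta[k] + step * b * X[:, j]
            candidates[k] = (loss(y, trial["mu"], trial["sigma"]), j, step * b)
        # Non-cyclical step: update only the submodel whose candidate gives the lower loss.
        k_best = min(candidates, key=lambda k: candidates[k][0])
        new_loss, j, delta = candidates[k_best]
        if new_loss >= history[-1]:
            break  # simplification; the stopping iteration is normally tuned
        coef[k_best][j] += delta
        eta[k_best] = eta[k_best] + delta * X[:, j]
        history.append(new_loss)
    return coef, offset, history

# Simulated example: x0 drives the mean, x1 drives the (log) standard deviation.
rng = np.random.default_rng(0)
n, p = 400, 4
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)
y = 1.0 + 2.0 * X[:, 0] + np.exp(0.5 * X[:, 1]) * rng.normal(size=n)
coef, offset, history = boost(X, y)
```

With a fixed step length in place of the `target / sizes[k]` rule, the submodel with the structurally larger negative gradient would tend to receive larger updates, which is the kind of imbalance the abstract refers to. In practice the number of iterations would be chosen by cross-validation or a similar resampling scheme rather than by the simple stopping check used here.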