For grouped covariates, we propose a boosting framework that enforces sparsity both within and between groups. By running component-wise and group-wise gradient boosting simultaneously with adjusted degrees of freedom, boosting can fit a model with properties similar to those of the sparse group lasso. We show that within-group and between-group sparsity can be controlled by a mixing parameter, and we discuss how this parameter resembles and differs from the mixing parameter in the sparse group lasso. Using simulations, gene data, and agricultural data, we show the effectiveness and predictive competitiveness of this estimator. The data and simulations suggest that, in the presence of grouped variables, sparse group boosting is associated with less biased variable selection and higher predictive performance than component-wise boosting. Additionally, we propose a way of reducing bias in component-wise boosting through the degrees of freedom.
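To make the idea of running component-wise and group-wise boosting together more concrete, the following is a minimal sketch, not the authors' implementation. It assumes squared-error loss, ridge-penalized linear base learners, degrees of freedom measured as the trace of the ridge hat matrix, and a mixing parameter `alpha` that assigns `alpha` degrees of freedom to each single-variable learner and `1 - alpha` to each group learner. These choices, and the names `ridge_projector` and `sparse_group_boost`, are illustrative assumptions and are not taken from the text above.

```python
import numpy as np


def ridge_projector(X, df):
    """Return P = (X'X + lam*I)^{-1} X', with lam chosen so trace(X @ P) is df.

    Assumption: degrees of freedom are defined as the trace of the hat matrix.
    """
    d2 = np.linalg.svd(X, compute_uv=False) ** 2

    def trace(lam):
        return np.sum(d2 / (d2 + lam))

    lo, hi = 0.0, 1.0
    while trace(hi) > df:  # trace decreases in lam, so grow hi until it brackets df
        hi *= 2.0
    for _ in range(200):  # bisection on the ridge penalty lam
        mid = 0.5 * (lo + hi)
        if trace(mid) > df:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T)


def sparse_group_boost(X, y, groups, alpha=0.5, n_iter=200, nu=0.1):
    """L2 boosting over single-variable and whole-group ridge base learners."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    Xc = X - X.mean(axis=0)  # centre covariates so the offset acts as intercept
    offset = y.mean()

    learners = []
    if alpha > 0:  # individual learners, each with df = alpha (assumed)
        for j in range(X.shape[1]):
            idx = np.array([j])
            learners.append((idx, ridge_projector(Xc[:, idx], alpha)))
    if alpha < 1:  # group learners, each with df = 1 - alpha (assumed)
        for g in np.unique(groups):
            idx = np.flatnonzero(groups == g)
            learners.append((idx, ridge_projector(Xc[:, idx], 1.0 - alpha)))

    beta = np.zeros(X.shape[1])
    resid = y - offset
    for _ in range(n_iter):
        best_rss, best_idx, best_b = np.inf, None, None
        for idx, P in learners:
            b = P @ resid  # ridge fit of this learner to the current residuals
            rss = np.sum((resid - Xc[:, idx] @ b) ** 2)
            if rss < best_rss:
                best_rss, best_idx, best_b = rss, idx, b
        # update only the best-fitting learner, shrunk by the step size nu
        beta[best_idx] += nu * best_b
        resid -= nu * (Xc[:, best_idx] @ best_b)

    intercept = offset - X.mean(axis=0) @ beta
    return intercept, beta
```

A call such as `sparse_group_boost(X, y, groups, alpha=0.5)` returns an intercept and a coefficient vector. Under the assumptions of this sketch, values of `alpha` near 1 make the single-variable learners more flexible than the group learners, and values near 0 do the opposite.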