Multicalibration gradient boosting has recently emerged as a scalable method that empirically produces approximately multicalibrated predictors and has been deployed at web scale. Despite this empirical success, its convergence properties are not well understood. In this paper, we provide computational guarantees for multicalibration gradient boosting algorithms. We show that the magnitude of successive prediction updates decays at a rate of $O(1/\sqrt{T})$ in the number of boosting rounds $T$, which implies the same rate bound for the empirical multicalibration error. Under additional smoothness assumptions on the weak learners, this rate improves to linear convergence. We further establish convergence for adaptive variants. Experiments on real-world datasets support our theory and clarify the regimes in which the method converges quickly.
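To make the quantity being bounded concrete, the following is a minimal sketch of one common way to measure empirical multicalibration error: the largest mass-weighted calibration gap over all (group, prediction-bin) cells. The abstract does not state the paper's exact definition, so the function name `multicalibration_error`, the equal-width binning, and the max-over-cells aggregation are all assumptions made for illustration.

```python
import numpy as np

def multicalibration_error(y, p, groups, n_bins=10):
    """Largest weighted calibration gap over all (group, bin) cells.

    Illustrative definition only; the paper's own metric may differ.

    y      : binary labels, shape (n,)
    p      : predicted probabilities in [0, 1], shape (n,)
    groups : boolean array of shape (n_groups, n); groups[g, i] is True
             when sample i belongs to group g (groups may overlap)
    """
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    groups = np.asarray(groups, dtype=bool)
    n = len(y)
    # Assign each prediction to one of n_bins equal-width bins on [0, 1];
    # a prediction of exactly 1.0 goes into the last bin.
    bins = np.minimum((p * n_bins).astype(int), n_bins - 1)
    worst = 0.0
    for member in groups:
        for b in range(n_bins):
            cell = member & (bins == b)
            if cell.any():
                # |sum of residuals| / n equals
                # (fraction of samples in the cell) * |mean residual in the cell|.
                gap = abs(np.sum(y[cell] - p[cell])) / n
                worst = max(worst, gap)
    return worst
```

Under this assumed definition, a boosting run could be monitored by calling `multicalibration_error(y, p_t, groups)` on the predictions `p_t` after each round and checking how the resulting sequence decays with the round index.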