Gradient boosted decision trees, particularly XGBoost, are among the most effective methods for tabular data. As deployment in sensitive settings increases, cryptographic guarantees of model integrity become essential. We present ZKBoost, the first zero-knowledge proof-of-training (zkPoT) protocol for XGBoost, enabling model owners to prove correct training on a committed dataset without revealing data or parameters. We make three key contributions: (1) a fixed-point XGBoost implementation compatible with arithmetic circuits, enabling efficient zkPoT instantiation; (2) a generic zkPoT template for XGBoost that can be instantiated with any general-purpose ZKP backend; and (3) a vector oblivious linear evaluation (VOLE)-based instantiation that resolves the challenges of proving nonlinear fixed-point operations. Our fixed-point implementation matches standard XGBoost accuracy to within 1\% while enabling practical zkPoT on real-world datasets.
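To make the fixed-point idea concrete, the sketch below shows how an XGBoost leaf weight, $w = -G / (H + \lambda)$ over gradient and Hessian sums, can be computed entirely in integer arithmetic, which is what arithmetic-circuit compatibility requires. This is an illustrative example only: the `SCALE` precision (Q16.16), the helper names, and the truncation-based division are our assumptions, not the paper's actual encoding.

```python
SCALE = 1 << 16  # assumed Q16.16 fixed-point precision, for illustration

def to_fixed(x: float) -> int:
    """Encode a real number as a scaled integer."""
    return int(round(x * SCALE))

def fixed_div(a: int, b: int) -> int:
    """Fixed-point division with truncation (one source of the
    small accuracy gap vs. floating-point XGBoost)."""
    return (a * SCALE) // b

# Leaf weight w = -G / (H + lambda), all values as scaled integers.
G = to_fixed(-3.5)    # sum of gradients in the leaf
H = to_fixed(10.0)    # sum of Hessians in the leaf
lam = to_fixed(1.0)   # L2 regularization term

w = fixed_div(-G, H + lam)
print(w / SCALE)  # close to the exact value 3.5 / 11 = 0.3181...
```

Because every intermediate value is an integer, the same computation can be expressed as constraints over a finite field inside a ZKP circuit, whereas native floating point cannot.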