GTree: GPU-Friendly Privacy-preserving Decision Tree Training and Inference

The decision tree (DT) is a widely used machine learning model due to its versatility, speed, and interpretability. However, for privacy-sensitive applications, outsourcing DT training and inference to cloud platforms raises concerns about data privacy. Researchers have developed privacy-preserving approaches for DT training and inference using cryptographic primitives, such as Secure Multi-Party Computation (MPC). While these approaches have shown progress, they still suffer from heavy computation and communication overheads. A few recent works employ Graphics Processing Units (GPUs) to improve the performance of MPC-protected deep learning. This raises a natural question: \textit{can MPC-protected DT training and inference be accelerated by GPU?} We present GTree, the first scheme that uses GPUs to accelerate MPC-protected DT training and inference. GTree is built across 3 parties who securely and jointly perform each step of DT training and inference on GPUs. Each MPC protocol in GTree is designed to be GPU-friendly. The performance evaluation shows that GTree achieves ${\thicksim}11{\times}$ and ${\thicksim}21{\times}$ improvements in training on the SPECT and Adult datasets, respectively, compared to the most efficient prior CPU-based work. For inference, GTree is most efficient when the DT has fewer than 10 levels: it is $126\times$ faster than the most efficient prior work when inferring $10^4$ instances with a tree of 7 levels. GTree also achieves a stronger security guarantee than prior solutions: it leaks only the tree depth and the size of the data samples, whereas prior solutions also leak the tree structure. With \textit{oblivious array access}, the access pattern on the GPU is also protected.
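The intuition behind oblivious array access can be sketched in plaintext as follows. This is only an illustration under stated assumptions, not GTree's actual protocol: in the MPC setting the index and the array would be secret-shared among the three parties and the comparison would itself be a secure sub-protocol. The function name `oblivious_read` is hypothetical. The sketch shows a linear scan that touches every element in the same order whatever index is requested, so the sequence of memory accesses does not depend on the index.

```python
def oblivious_read(array, index):
    """Return array[index] while reading every element in a fixed order.

    Hypothetical plaintext illustration: the memory locations visited are
    the same for every index. It is not a constant-time or secret-shared
    implementation.
    """
    result = 0
    for position, value in enumerate(array):
        # selector is 1 only at the requested position and 0 elsewhere.
        selector = int(position == index)
        # Arithmetic selection: every element is read and multiplied,
        # so no element is skipped based on the index.
        result += selector * value
    return result


values = [10, 20, 30, 40]
print(oblivious_read(values, 2))  # prints 30
```

The scan costs time linear in the array length per access, but each element is processed independently before a final sum, which is the kind of data-parallel pattern that can map well onto a GPU.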
