Differential Privacy (DP) was originally developed to protect privacy. Recently, however, it has also been used to defend machine learning (ML) models against poisoning attacks, with DP-SGD receiving substantial attention. Even so, how effective different DP techniques are at preventing backdoor attacks in practice has not been thoroughly investigated. In this paper, we investigate the effectiveness of DP-SGD and, for the first time in the literature, examine PATE in the context of backdoor attacks. We also explore the role of different components of DP algorithms in defending against backdoor attacks and show that PATE is effective against these attacks because of the bagging structure of the teacher models it employs. Our experiments reveal that hyperparameters and the number of backdoors in the training dataset affect the success of DP algorithms. Additionally, we propose Label-DP as a faster and more accurate alternative to DP-SGD and PATE. We conclude that while Label-DP algorithms generally offer weaker privacy protection, careful hyperparameter tuning can make them more effective than standard DP methods at defending against backdoor attacks while maintaining model accuracy.
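To make the bagging argument for PATE concrete, the following is a minimal sketch of PATE-style noisy vote aggregation, not the paper's implementation. It assumes that each teacher is trained on a disjoint partition of the data, so that poisoned samples reach only a few teachers. The function name `noisy_majority_vote`, the teacher count, the number of affected teachers, and the Laplace noise scale are all hypothetical values chosen for illustration.

```python
import numpy as np


def noisy_majority_vote(teacher_preds, num_classes, noise_scale, rng):
    """Return the label with the highest Laplace-noised vote count (PATE-style)."""
    # Count how many teachers voted for each class.
    counts = np.bincount(teacher_preds, minlength=num_classes).astype(float)
    # Add independent Laplace noise to every count before taking the argmax.
    counts += rng.laplace(loc=0.0, scale=noise_scale, size=num_classes)
    return int(np.argmax(counts))


rng = np.random.default_rng(0)
num_teachers, num_classes = 100, 10
true_label, backdoor_target = 3, 7

# Assumed scenario: the poisoned samples fell into the partitions of only
# 5 teachers, so only those teachers predict the attacker's target label
# on a triggered input. The remaining 95 teachers predict the clean label.
teacher_preds = np.full(num_teachers, true_label)
teacher_preds[:5] = backdoor_target

label = noisy_majority_vote(teacher_preds, num_classes, noise_scale=2.0, rng=rng)
print("aggregated label:", label)
```

With these illustrative numbers the clean label leads by 90 votes, far more than the added noise, so the few backdoored teachers are outvoted. If the poisoned samples were spread across most partitions, the margin would shrink, which is consistent with the abstract's observation that the number of backdoors in the training dataset affects how well the defense works.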