Preserving privacy in contemporary NLP models allows us to work with sensitive data, but unfortunately comes at a price. We know that stricter privacy guarantees in differentially private stochastic gradient descent (DP-SGD) generally degrade model performance. However, previous research on the efficiency of DP-SGD in NLP is inconclusive or even counter-intuitive. In this short paper, we provide an extensive analysis of different privacy-preserving strategies on seven downstream datasets in five different 'typical' NLP tasks with varying complexity using modern neural models based on BERT and XtremeDistil architectures. We show that unlike standard non-private approaches to solving NLP tasks, where bigger is usually better, privacy-preserving strategies do not exhibit a winning pattern, and each task and privacy regime requires a special treatment to achieve adequate performance.