The recent success of machine learning (ML) has been fueled by the increasing availability of computing power and large amounts of data in many different applications. However, the trustworthiness of the resulting models can be compromised when such data is maliciously manipulated to mislead the learning process. In this article, we first review poisoning attacks that compromise the training data used to learn ML models, including attacks that aim to reduce overall performance, manipulate the predictions on specific test samples, or even implant backdoors in the model. We then discuss how to mitigate these attacks, either by applying basic security principles or by deploying ML-oriented defensive mechanisms. We conclude by formulating open challenges that hinder the development of testing methods and benchmarks suitable for assessing and improving the trustworthiness of ML models against data poisoning attacks.