Data poisoning considers settings in which an adversary manipulates the behavior of machine learning algorithms through malicious training data. Existing threat models of data poisoning center on a single metric: the number of poisoned samples. Consequently, if attackers can poison more samples than expected at an affordable cost, as is possible in many practical scenarios, they may quickly render existing defenses ineffective. To address this issue, we leverage timestamps denoting the birth dates of data, which are often available but have so far been neglected. Using these timestamps, we propose a temporal threat model of data poisoning with two novel metrics, earliness and duration, which respectively measure how far in advance an attack started and how long it lasted. With these metrics, we define notions of temporal robustness against data poisoning, which provide a meaningful sense of protection even when the number of poisoned samples is unbounded. We present a benchmark with an evaluation protocol that simulates continuous data collection and periodic deployment of updated models, enabling empirical evaluation of temporal robustness. Lastly, we develop and empirically verify a baseline defense, temporal aggregation, which offers provable temporal robustness and highlights the potential of our temporal threat model for data poisoning.
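To make the two temporal metrics concrete, the following is a minimal sketch of how earliness and duration might be computed from the timestamps of poisoned samples. It is an illustration under stated assumptions, not the paper's formal definition: timestamps are assumed to be integer period indices, `target_time` (a hypothetical name) is the time at which the attacked model is used, earliness is taken as the gap between `target_time` and the earliest poisoned timestamp, and duration as the gap between the latest and earliest poisoned timestamps. The exact conventions, such as whether the endpoints are counted inclusively, are assumptions here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TemporalAttackProfile:
    """Hypothetical container for the two temporal metrics of an attack."""
    earliness: int  # how far in advance of target_time the attack started
    duration: int   # how long the attack lasted


def temporal_profile(
    poisoned_timestamps: Sequence[int], target_time: int
) -> TemporalAttackProfile:
    """Compute earliness and duration from poisoned-sample timestamps.

    Assumption: timestamps are integer period indices, and no poisoned
    sample is dated after target_time.
    """
    if not poisoned_timestamps:
        # No poisoned samples: treat the attack as having zero extent.
        return TemporalAttackProfile(earliness=0, duration=0)

    first = min(poisoned_timestamps)
    last = max(poisoned_timestamps)
    if last > target_time:
        raise ValueError("poisoned samples must not postdate target_time")

    return TemporalAttackProfile(
        earliness=target_time - first,
        duration=last - first,
    )


if __name__ == "__main__":
    # Three poisoned samples dated 7, 8 and 9; the model is attacked at time 10.
    print(temporal_profile([7, 8, 9], target_time=10))
    # A thousand samples at time 7 and one at time 9 give the same profile.
    print(temporal_profile([7] * 1000 + [9], target_time=10))
```

In this sketch the profile depends only on when poisoned samples are dated, not on how many there are, which reflects why bounds on earliness and duration can remain meaningful even when the number of poisoned samples is unbounded.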