Adversarial examples are crafted by adding indistinguishable perturbations to normal examples in order to fool a well-trained deep learning model into misclassifying them. In computer vision, this notion of indistinguishability is typically bounded by the $L_{\infty}$ norm or other norms. However, these norms are not appropriate for measuring indistinguishability in time series data. In this work, we propose, for the first time, adversarial examples in the Wasserstein space for time series data, and we use the Wasserstein distance to bound the perturbation between normal examples and adversarial examples. We introduce Wasserstein projected gradient descent (WPGD), an adversarial attack method for perturbing univariate time series data. We leverage the closed-form solution of the Wasserstein distance in 1D space to compute the projection step of WPGD efficiently with gradient descent. We further propose a two-step projection so that the search for adversarial examples in the Wasserstein space is guided and constrained by Euclidean norms, yielding more effective and imperceptible perturbations. We empirically evaluate the proposed attack on several time series datasets in the healthcare domain. Extensive results demonstrate that the Wasserstein attack is powerful and achieves a high attack success rate against most of the target classifiers. To better study the nature of Wasserstein adversarial examples, we evaluate a strong defense mechanism, Wasserstein smoothing, as a potential certified robustness defense. Although the defense achieves some accuracy gain, it still has limitations in many cases, leaving room for developing stronger certified robustness methods against Wasserstein adversarial examples on univariate time series data.
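The closed-form 1D Wasserstein distance mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `wasserstein_1d` is hypothetical, and the sketch assumes each series is non-negative, has the same length, and is normalized to unit mass over a shared unit-spaced time grid. Under those assumptions, the 1-Wasserstein distance equals the sum of absolute differences between the two cumulative distributions.

```python
from itertools import accumulate


def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-length, non-negative series.

    Assumption (not from the source): each series is treated as a histogram
    over the same unit-spaced time grid and is normalized to unit mass.
    """
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    if min(a) < 0 or min(b) < 0:
        raise ValueError("series must be non-negative")
    total_a, total_b = sum(a), sum(b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("series must have positive total mass")

    # Cumulative distribution functions of the normalized series.
    cdf_a = accumulate(x / total_a for x in a)
    cdf_b = accumulate(x / total_b for x in b)

    # In 1D, W1 is the area between the two CDFs (grid spacing = 1).
    return sum(abs(fa - fb) for fa, fb in zip(cdf_a, cdf_b))


# A unit spike shifted by one step vs. two steps.
spike = [1.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0]
far = [0.0, 0.0, 1.0]

print(wasserstein_1d(spike, near))  # 1.0
print(wasserstein_1d(spike, far))   # 2.0
```

In this toy case the $L_{\infty}$ distance from `spike` is 1 for both `near` and `far`, whereas the Wasserstein distance grows with the size of the temporal shift, which is one way to see why the two notions of closeness differ for time series.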