Deep learning models have shown promising predictive accuracy for time-series healthcare applications. However, ensuring the robustness of these models is vital for building trustworthy AI systems. Existing research predominantly focuses on robustness to synthetic adversarial examples, crafted by adding imperceptible perturbations to clean input data. These synthetic adversarial examples, however, do not accurately reflect the most challenging real-world scenarios, especially in the context of healthcare data. Consequently, robustness to synthetic adversarial examples does not necessarily translate to robustness against naturally occurring adversarial examples, which is highly desirable for trustworthy AI. We propose a method to curate datasets composed of natural adversarial examples to evaluate model robustness. The method relies on probabilistic labels obtained from automated weakly-supervised labeling that combines noisy and cheap-to-obtain labeling heuristics. Based on these labels, our method adversarially orders the input data and uses this ordering to construct a sequence of increasingly adversarial datasets. Our evaluation on six medical case studies and three non-medical case studies demonstrates the efficacy and statistical validity of our approach to generating naturally adversarial datasets.
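The ordering and dataset-construction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the ordering criterion, so the sketch assumes binary probabilistic labels and treats a sample as more adversarial the closer its weakly-supervised probability lies to 0.5. The function names `adversarial_order` and `increasingly_adversarial_datasets`, and the choice of nested suffixes as the dataset sequence, are hypothetical.

```python
import numpy as np


def adversarial_order(prob_pos):
    """Return sample indices sorted from least to most adversarial.

    Assumption (not from the source): a sample is more adversarial when its
    probabilistic label is closer to 0.5, i.e. the labeling heuristics
    disagree or are uncertain about it.
    """
    prob_pos = np.asarray(prob_pos, dtype=float)
    confidence = np.abs(prob_pos - 0.5)
    # Sort by descending confidence, so adversarialness increases along the order.
    return np.argsort(-confidence, kind="stable")


def increasingly_adversarial_datasets(prob_pos, n_datasets):
    """Build a sequence of index sets, each dropping more of the confident samples.

    Dataset k keeps every sample from position k * n // n_datasets onward in the
    adversarial order, so later datasets are smaller and more adversarial.
    """
    order = adversarial_order(prob_pos)
    n = len(order)
    return [order[(k * n) // n_datasets:] for k in range(n_datasets)]


# Hypothetical probabilistic labels for six samples.
probs = [0.95, 0.55, 0.10, 0.48, 0.80, 0.30]
order = adversarial_order(probs)
datasets = increasingly_adversarial_datasets(probs, n_datasets=3)
```

Under this assumed criterion, a model's accuracy can be evaluated on each index set in turn to observe how performance changes as the data become more adversarial.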