Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency - 专知论文

会员服务 ·

0

稳健性 · 稳健 · 一致 · 后门攻击 · 样本 ·

2023 年 3 月 27 日

Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency

翻译：基于腐败鲁棒性一致性的推理阶段后门检测

Xiaogeng Liu,Minghui Li,Haoyu Wang,Shengshan Hu,Dengpan Ye,Hai Jin,Libing Wu,Chaowei Xiao

from arxiv, Accepted by CVPR2023. Code is available at https://github.com/CGCL-codes/TeCo

Deep neural networks are proven to be vulnerable to backdoor attacks. Detecting the trigger samples during the inference stage, i.e., the test-time trigger sample detection, can prevent the backdoor from being triggered. However, existing detection methods often require the defenders to have high accessibility to victim models, extra clean data, or knowledge about the appearance of backdoor triggers, limiting their practicality. In this paper, we propose the test-time corruption robustness consistency evaluation (TeCo), a novel test-time trigger sample detection method that only needs the hard-label outputs of the victim models without any extra information. Our journey begins with the intriguing observation that the backdoor-infected models have similar performance across different image corruptions for the clean images, but perform discrepantly for the trigger samples. Based on this phenomenon, we design TeCo to evaluate test-time robustness consistency by calculating the deviation of severity that leads to predictions' transition across different corruptions. Extensive experiments demonstrate that compared with state-of-the-art defenses, which even require either certain information about the trigger types or accessibility of clean data, TeCo outperforms them on different backdoor attacks, datasets, and model architectures, enjoying a higher AUROC by 10% and 5 times of stability.

翻译：深度神经网络已被证明易受后门攻击。在推理阶段检测触发样本，即测试时触发样本检测，可以防止后门被激活。然而，现有检测方法通常要求防御者对受害模型具有高访问权限、额外的干净数据或关于后门触发器外观的知识，这限制了其实用性。本文提出测试时腐败鲁棒性一致性评估（TeCo），一种新颖的测试时触发样本检测方法，该方法仅需受害模型的硬标签输出，无需任何额外信息。我们的研究始于一个有趣的观察：对于干净图像，受后门感染的模型在不同图像腐败情况下表现出相似的性能，但对于触发样本却表现出不一致的性能。基于这一现象，我们设计TeCo通过计算不同腐败情况下导致预测转变的严重性偏差来评估测试时鲁棒性一致性。大量实验表明，与需要触发器类型某些信息或可访问干净数据的最先进防御方法相比，TeCo在不同后门攻击、数据集和模型架构上均优于它们，AUROC值高出10%，稳定性提升5倍。

1

相关内容

稳健性

【CVPR 2022】未知损坏的一体化图像恢复,All-In-One Image Restoration for Unknown Corruption

【CVPR 2022】未知损坏的一体化图像恢复,All-In-One Image Restoration for Unknown Corruption

专知会员服务

17+阅读 · 2022年3月28日

【ACL2022】解释生成的多尺度分布深度变分自编码器, Multi-Scale Distribution Deep Variational Autoencoder for Explanation Generation

【ACL2022】解释生成的多尺度分布深度变分自编码器, Multi-Scale Distribution Deep Variational Autoencoder for Explanation Generation

专知会员服务

12+阅读 · 2022年3月24日

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

专知会员服务

15+阅读 · 2022年3月12日

【ICLR2021】神经元注意力蒸馏消除DNN中的后门触发器

【ICLR2021】神经元注意力蒸馏消除DNN中的后门触发器

专知会员服务

15+阅读 · 2021年1月31日

首篇《后门学习综述》论文发布，阐述AI系统训练过程的安全性问题

专知会员服务

31+阅读 · 2020年11月21日

【KDD2020】动态图的拉普拉斯变换点检测，Laplacian Change Point Detection for Dynamic Graphs

【KDD2020】动态图的拉普拉斯变换点检测，Laplacian Change Point Detection for Dynamic Graphs

专知会员服务

38+阅读 · 2020年7月3日

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

专知会员服务

22+阅读 · 2020年4月16日

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

专知会员服务

33+阅读 · 2020年3月23日

【AAAI2020】Context-Transformer:上下文转换器:解决对象混淆的小样本检测，Context-Transformer: Tackling Object Confusion for Few-Shot Detection

【AAAI2020】Context-Transformer:上下文转换器:解决对象混淆的小样本检测，Context-Transformer: Tackling Object Confusion for Few-Shot Detection

专知会员服务

51+阅读 · 2020年3月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

KDD 2022 | 中科院计算所提出无监督高鲁棒性图结构学习框架—STABLE

KDD 2022 | 中科院计算所提出无监督高鲁棒性图结构学习框架—STABLE

PaperWeekly

0+阅读 · 2022年11月26日

EMNLP 2022 | 北大提出基于中间层特征的在线文本后门防御新SOTA

EMNLP 2022 | 北大提出基于中间层特征的在线文本后门防御新SOTA

PaperWeekly

0+阅读 · 2022年11月7日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

非平衡数据集 focal loss 多类分类

非平衡数据集 focal loss 多类分类

AI研习社

33+阅读 · 2019年4月23日

【泡泡一分钟】DS-SLAM: 动态环境下的语义视觉SLAM

【泡泡一分钟】DS-SLAM: 动态环境下的语义视觉SLAM

泡泡机器人SLAM

23+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

【泡泡一分钟】基于李群的无损卡尔曼滤波器在视觉里程计上的应用

【泡泡一分钟】基于李群的无损卡尔曼滤波器在视觉里程计上的应用

泡泡机器人SLAM

11+阅读 · 2018年12月17日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

基于对称识别方法的贝叶斯probit模型稳健性研究

国家自然科学基金

3+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于自学习对比度视觉注意模型和自适应深度特征的无分类目标检测

国家自然科学基金

2+阅读 · 2015年12月31日

基于高维大规模数据的集成建模方法的研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于行为踪迹的网络蠕虫模型和检测方法

国家自然科学基金

0+阅读 · 2013年12月31日

11C-PD153035 PET/CT筛选非小细胞肺癌EGFR突变和监测EGFR-TKIs疗效的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于二型模糊逻辑的多核程序数据竞争与死锁检测方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

市场竞争、不确定性与企业非效率投资- - 基于非效率投资确定方法的改进

国家自然科学基金

0+阅读 · 2012年12月31日

芯片硬件木马安全检测方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

用多重假设检验方法来研究方差变点问题

国家自然科学基金

0+阅读 · 2009年12月31日

GSURE-Based Diffusion Model Training with Corrupted Data

Arxiv

0+阅读 · 2023年5月22日

Enhanced Meta Label Correction for Coping with Label Corruption

Arxiv

0+阅读 · 2023年5月22日

Uncertainty-based Detection of Adversarial Attacks in Semantic Segmentation

Arxiv

0+阅读 · 2023年5月22日

Quantifying the effect of X-ray scattering for data generation in real-time defect detection

Arxiv

0+阅读 · 2023年5月22日

CCT-Code: Cross-Consistency Training for Multilingual Clone Detection and Code Search

Arxiv

0+阅读 · 2023年5月19日

CLEME: Debiasing Multi-reference Evaluation for Grammatical Error Correction

Arxiv

0+阅读 · 2023年5月18日

Measurement Based Evaluation and Mitigation of Flood Attacks on a LAN Test-Bed

Arxiv

0+阅读 · 2023年5月17日

A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

Arxiv

33+阅读 · 2023年2月18日

Do Feature Attribution Methods Correctly Attribute Features?

Arxiv

15+阅读 · 2021年12月15日

Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection

Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection

Arxiv

15+阅读 · 2019年3月18日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

专知会员服务

2+阅读 · 6月19日

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

专知会员服务

4+阅读 · 6月19日

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

专知会员服务

5+阅读 · 6月18日

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

专知会员服务

6+阅读 · 6月18日

《廉价自杀式无人机战争的军事战略影响：乌克兰和伊朗案例研究》

《廉价自杀式无人机战争的军事战略影响：乌克兰和伊朗案例研究》

专知会员服务

11+阅读 · 6月18日

《面向反无人机作战的联邦式可解释射频–光电/红外情报融合：边缘人工智能优化、电子战韧性及分布式监视验证》

《面向反无人机作战的联邦式可解释射频–光电/红外情报融合：边缘人工智能优化、电子战韧性及分布式监视验证》

专知会员服务

9+阅读 · 6月18日

ICML 2026 | FR3D：解耦自车运动的未来动态三维重建世界模型

ICML 2026 | FR3D：解耦自车运动的未来动态三维重建世界模型

专知会员服务

6+阅读 · 6月17日

【伯克利博士论文】迈向可扩展与自我演进的大语言模型智能体

【伯克利博士论文】迈向可扩展与自我演进的大语言模型智能体

专知会员服务

9+阅读 · 6月17日

学习数据的几何：形状空间分析数学综述

学习数据的几何：形状空间分析数学综述

专知会员服务

7+阅读 · 6月17日

《现代防空系统综述：架构、传感器、拦截器及新兴威胁环境对基础设施受限防御环境的影响》2026最新长综述

《现代防空系统综述：架构、传感器、拦截器及新兴威胁环境对基础设施受限防御环境的影响》2026最新长综述

专知会员服务

13+阅读 · 6月17日

定向能反无人机系统最新发展动态

定向能反无人机系统最新发展动态

专知会员服务

8+阅读 · 6月17日

从燃煤战舰到算法战争：水面指挥的永恒要求

从燃煤战舰到算法战争：水面指挥的永恒要求

专知会员服务

6+阅读 · 6月17日

《短程弹道再入飞行器拦截时间中的一项异常现象》

《短程弹道再入飞行器拦截时间中的一项异常现象》

专知会员服务

8+阅读 · 6月17日

《基于回归方法与任务上下文的对抗环境动态战术网络报文优先级排序》

《基于回归方法与任务上下文的对抗环境动态战术网络报文优先级排序》

专知会员服务

8+阅读 · 6月17日

美智库《战术级指挥控制的迫切要求：构建弹性机动式指挥控制网络》报告

美智库《战术级指挥控制的迫切要求：构建弹性机动式指挥控制网络》报告

专知会员服务

10+阅读 · 6月17日

相关VIP内容

【CVPR 2022】未知损坏的一体化图像恢复,All-In-One Image Restoration for Unknown Corruption

【CVPR 2022】未知损坏的一体化图像恢复,All-In-One Image Restoration for Unknown Corruption

专知会员服务

17+阅读 · 2022年3月28日

【ACL2022】解释生成的多尺度分布深度变分自编码器, Multi-Scale Distribution Deep Variational Autoencoder for Explanation Generation

【ACL2022】解释生成的多尺度分布深度变分自编码器, Multi-Scale Distribution Deep Variational Autoencoder for Explanation Generation

专知会员服务

12+阅读 · 2022年3月24日

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

专知会员服务

15+阅读 · 2022年3月12日

【ICLR2021】神经元注意力蒸馏消除DNN中的后门触发器

【ICLR2021】神经元注意力蒸馏消除DNN中的后门触发器

专知会员服务

15+阅读 · 2021年1月31日

首篇《后门学习综述》论文发布，阐述AI系统训练过程的安全性问题

专知会员服务

31+阅读 · 2020年11月21日

【KDD2020】动态图的拉普拉斯变换点检测，Laplacian Change Point Detection for Dynamic Graphs

【KDD2020】动态图的拉普拉斯变换点检测，Laplacian Change Point Detection for Dynamic Graphs

专知会员服务

38+阅读 · 2020年7月3日

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

专知会员服务

22+阅读 · 2020年4月16日

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

专知会员服务

33+阅读 · 2020年3月23日

【AAAI2020】Context-Transformer:上下文转换器:解决对象混淆的小样本检测，Context-Transformer: Tackling Object Confusion for Few-Shot Detection

【AAAI2020】Context-Transformer:上下文转换器:解决对象混淆的小样本检测，Context-Transformer: Tackling Object Confusion for Few-Shot Detection

专知会员服务

51+阅读 · 2020年3月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

相关资讯

KDD 2022 | 中科院计算所提出无监督高鲁棒性图结构学习框架—STABLE

KDD 2022 | 中科院计算所提出无监督高鲁棒性图结构学习框架—STABLE

PaperWeekly

0+阅读 · 2022年11月26日

EMNLP 2022 | 北大提出基于中间层特征的在线文本后门防御新SOTA

EMNLP 2022 | 北大提出基于中间层特征的在线文本后门防御新SOTA

PaperWeekly

0+阅读 · 2022年11月7日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

非平衡数据集 focal loss 多类分类

非平衡数据集 focal loss 多类分类

AI研习社

33+阅读 · 2019年4月23日

【泡泡一分钟】DS-SLAM: 动态环境下的语义视觉SLAM

【泡泡一分钟】DS-SLAM: 动态环境下的语义视觉SLAM

泡泡机器人SLAM

23+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

【泡泡一分钟】基于李群的无损卡尔曼滤波器在视觉里程计上的应用

【泡泡一分钟】基于李群的无损卡尔曼滤波器在视觉里程计上的应用

泡泡机器人SLAM

11+阅读 · 2018年12月17日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

相关论文

GSURE-Based Diffusion Model Training with Corrupted Data

Arxiv

0+阅读 · 2023年5月22日

Enhanced Meta Label Correction for Coping with Label Corruption

Arxiv

0+阅读 · 2023年5月22日

Uncertainty-based Detection of Adversarial Attacks in Semantic Segmentation

Arxiv

0+阅读 · 2023年5月22日

Quantifying the effect of X-ray scattering for data generation in real-time defect detection

Arxiv

0+阅读 · 2023年5月22日

CCT-Code: Cross-Consistency Training for Multilingual Clone Detection and Code Search

Arxiv

0+阅读 · 2023年5月19日

CLEME: Debiasing Multi-reference Evaluation for Grammatical Error Correction

Arxiv

0+阅读 · 2023年5月18日

Measurement Based Evaluation and Mitigation of Flood Attacks on a LAN Test-Bed

Arxiv

0+阅读 · 2023年5月17日

A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

Arxiv

33+阅读 · 2023年2月18日

Do Feature Attribution Methods Correctly Attribute Features?

Arxiv

15+阅读 · 2021年12月15日

Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection

Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection

Arxiv

15+阅读 · 2019年3月18日

相关基金

基于对称识别方法的贝叶斯probit模型稳健性研究

国家自然科学基金

3+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于自学习对比度视觉注意模型和自适应深度特征的无分类目标检测

国家自然科学基金

2+阅读 · 2015年12月31日

基于高维大规模数据的集成建模方法的研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于行为踪迹的网络蠕虫模型和检测方法

国家自然科学基金

0+阅读 · 2013年12月31日

11C-PD153035 PET/CT筛选非小细胞肺癌EGFR突变和监测EGFR-TKIs疗效的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于二型模糊逻辑的多核程序数据竞争与死锁检测方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

市场竞争、不确定性与企业非效率投资- - 基于非效率投资确定方法的改进

国家自然科学基金

0+阅读 · 2012年12月31日

芯片硬件木马安全检测方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

用多重假设检验方法来研究方差变点问题

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员