MVPatch: More Vivid Patch for Adversarial Camouflaged Attacks on Object Detectors in the Physical World

From arXiv; 16 pages, 8 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Recent studies have shown that Adversarial Patches (APs) can effectively manipulate object detection models. However, the conspicuous patterns often associated with these patches tend to attract human attention, posing a significant challenge. Existing research has primarily focused on enhancing attack efficacy in the physical domain while often neglecting the optimization of stealthiness and transferability. Furthermore, applying APs in real-world scenarios faces major challenges related to transferability, stealthiness, and practicality. To address these challenges, we introduce generalization theory into the context of APs, enabling our iterative process to simultaneously enhance transferability and refine visual correlation with realistic images. We propose a Dual-Perception-Based Framework (DPBF) to generate the More Vivid Patch (MVPatch), which enhances transferability, stealthiness, and practicality. The DPBF integrates two key components: the Model-Perception-Based Module (MPBM) and the Human-Perception-Based Module (HPBM), along with regularization terms. The MPBM employs an ensemble strategy to reduce object confidence scores across multiple detectors, thereby improving AP transferability with robust theoretical support. Concurrently, the HPBM introduces a lightweight method for achieving visual similarity, creating natural and inconspicuous adversarial patches without relying on additional generative models. The regularization terms further enhance the practicality of the generated APs in the physical domain. Additionally, we introduce naturalness and transferability scores to provide an unbiased assessment of APs. Extensive experimental validation demonstrates that MVPatch achieves superior transferability and a natural appearance in both digital and physical domains, underscoring its effectiveness and stealthiness.

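The abstract describes the DPBF objective only in words: an ensemble term over several detectors (MPBM), a visual-similarity term (HPBM), and regularization terms. The sketch below shows one way such a combined loss could be assembled. Everything specific in it is an assumption for illustration and is not taken from the paper: the mean-squared-error similarity, the total-variation regularizer, the weights `alpha` and `beta`, the grayscale list-of-lists patch, and the toy detectors `mean_pixel` and `max_pixel`, which stand in for real object detectors.

```python
from typing import Callable, List, Sequence

# Hypothetical simplification: a grayscale patch as rows of floats in [0, 1].
Patch = List[List[float]]
# A "detector" here is any function mapping a patch to an object-confidence score.
Detector = Callable[[Patch], float]


def ensemble_confidence(patch: Patch, detectors: Sequence[Detector]) -> float:
    """Average confidence over all detectors (assumed stand-in for the MPBM term)."""
    return sum(detect(patch) for detect in detectors) / len(detectors)


def similarity_loss(patch: Patch, reference: Patch) -> float:
    """Mean squared error to a natural reference image (assumed stand-in for HPBM)."""
    total, count = 0.0, 0
    for patch_row, ref_row in zip(patch, reference):
        for p, r in zip(patch_row, ref_row):
            total += (p - r) ** 2
            count += 1
    return total / count


def total_variation(patch: Patch) -> float:
    """Sum of absolute differences between adjacent pixels (assumed regularizer)."""
    height, width = len(patch), len(patch[0])
    tv = 0.0
    for i in range(height):
        for j in range(width):
            if i + 1 < height:
                tv += abs(patch[i + 1][j] - patch[i][j])
            if j + 1 < width:
                tv += abs(patch[i][j + 1] - patch[i][j])
    return tv


def dual_perception_loss(
    patch: Patch,
    reference: Patch,
    detectors: Sequence[Detector],
    alpha: float = 1.0,  # hypothetical weight on the similarity term
    beta: float = 0.1,   # hypothetical weight on the regularizer
) -> float:
    """Weighted sum of the three terms; lower is better for the attacker."""
    return (
        ensemble_confidence(patch, detectors)
        + alpha * similarity_loss(patch, reference)
        + beta * total_variation(patch)
    )


# Toy detectors, purely illustrative: real detectors would be neural networks.
def mean_pixel(patch: Patch) -> float:
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels)


def max_pixel(patch: Patch) -> float:
    return max(p for row in patch for p in row)


example_patch = [[0.0, 1.0], [1.0, 0.0]]
example_reference = [[0.0, 0.0], [0.0, 0.0]]
example_loss = dual_perception_loss(
    example_patch, example_reference, [mean_pixel, max_pixel]
)
```

In a real setting the patch would be updated iteratively to lower this loss, for instance by gradient descent in an automatic-differentiation framework. Averaging over several detectors, rather than attacking one, is the part that corresponds to the ensemble strategy the abstract credits for transferability.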