Transparency Attacks: How Imperceptible Image Layers Can Fool AI Perception

This paper investigates a novel algorithmic vulnerability in which imperceptible image layers confound multiple vision models into assigning arbitrary labels and captions. We explore image preprocessing methods that introduce stealth transparency, which causes an AI model to misinterpret what the human eye perceives. The research maps a broad attack surface and investigates consequences ranging from traditional watermarking and steganography to background-foreground miscues. We demonstrate dataset poisoning by using the attack to mislabel a collection of grayscale landscapes and logos, with either a single attack layer or randomly selected poisoning classes. For example, what the human eye sees as a military tank is mislabeled as a bridge by object classifiers based on convolutional networks (YOLO, etc.) and vision transformers (ViT, GPT-Vision, etc.). A notable limitation of the attack is that the grayscale background (hidden) layer must roughly match the transparent foreground image that the human eye perceives. This dependency limits the practical success rate without manual tuning and exposes the hidden layer when the image is placed on the opposite display theme (e.g., an image with a light background and a visible light transparent foreground works best in a light-theme image viewer or browser). Stealth transparency confounds established vision systems, with implications for evading facial recognition and surveillance, digital watermarking, content filtering, dataset curation, automotive and drone autonomy, forensic evidence tampering, and retail product misclassification. This method stands in contrast to traditional adversarial attacks, which typically modify pixel values in ways that are either slightly perceptible or entirely imperceptible to both humans and machines.
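
The gap between what a viewer renders and what a model ingests can be illustrated with a small diagnostic sketch. It is not the paper's method: it assumes the viewer alpha-composites the image over a uniform light background, and that the model's loader simply discards the alpha channel. Real pipelines may handle transparency differently. The function names and the 0.1 threshold are hypothetical choices made for this illustration.

```python
import numpy as np

def viewer_view(rgba, background=1.0):
    """What a viewer shows: alpha-composite over a uniform background level.

    rgba is a float array of shape (H, W, 4) with values in [0, 1].
    background=1.0 models a light theme, 0.0 a dark theme.
    """
    rgb, alpha = rgba[..., :3], rgba[..., 3:4]
    return alpha * rgb + (1.0 - alpha) * background

def model_view(rgba):
    """What a model sees under the ASSUMPTION that its loader drops alpha."""
    return rgba[..., :3]

def view_gap(rgba, background=1.0):
    """Mean absolute difference between the rendered and the ingested image."""
    return float(np.mean(np.abs(viewer_view(rgba, background) - model_view(rgba))))

def looks_layered(rgba, threshold=0.1):
    """Flag images whose two views diverge; the threshold is illustrative only."""
    return view_gap(rgba) > threshold
```

A fully opaque image gives a gap of zero under these assumptions, while a semi-transparent dark layer over a light theme gives a large one. Evaluating `view_gap` with `background=0.0` as well shows how the same file can look different on the opposite display theme, which is the exposure the abstract describes.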
