Vision Transformers (ViTs) have emerged as powerful architectures in medical image analysis, excelling in tasks such as disease detection, segmentation, and classification. However, their reliance on large, attention-driven models makes them vulnerable to hardware-level attacks. In this paper, we propose Med-Hammer, a novel threat model that combines Rowhammer hardware fault injection with neural Trojan attacks to compromise the integrity of ViT-based medical imaging systems. Specifically, we demonstrate how malicious bit flips induced via Rowhammer can trigger implanted neural Trojans, leading to targeted misclassification or suppression of critical diagnoses (e.g., tumors or lesions) in medical scans. Through extensive experiments on benchmark medical imaging datasets such as ISIC, Brain Tumor, and MedMNIST, we show that such attacks can remain stealthy while achieving high attack success rates of about 82.51% and 92.56% on MobileViT and SwinTransformer, respectively. We further investigate how architectural properties, such as model sparsity, attention weight distribution, and the number of features in a layer, impact attack effectiveness. Our findings highlight a critical and underexplored intersection between hardware-level faults and deep learning security in healthcare applications, underscoring the urgent need for robust defenses spanning both model architectures and the underlying hardware platforms.
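To give a sense of why a single Rowhammer-induced bit flip can matter, the following minimal sketch flips one bit of a value stored as an IEEE 754 float32. It is an illustration only and not the Med-Hammer procedure: it assumes the affected memory cell holds a float32 model weight, which the abstract does not specify, and the helper `flip_bit` and the example weight `0.5` are hypothetical.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value`, encoded as float32, with one bit flipped.

    Bit 0 is the least significant mantissa bit; bit 31 is the sign bit.
    Hypothetical helper for illustration, not part of the paper.
    """
    if not 0 <= bit <= 31:
        raise ValueError("bit must be in the range [0, 31]")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR with a one-bit mask flips exactly that bit.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


weight = 0.5  # assumed example weight
for bit in (0, 30, 31):
    print(f"bit {bit:2d}: {weight} -> {flip_bit(weight, bit)!r}")
```

Under these assumptions, a flip in the lowest mantissa bit barely changes the value, while a flip in the highest exponent bit changes it by many orders of magnitude, so where a fault lands matters as much as whether it occurs.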