AttackGNN: Red-Teaming GNNs in Hardware Security Using Reinforcement Learning

Machine learning has shown great promise in addressing several critical hardware security problems. In particular, researchers have developed novel graph neural network (GNN)-based techniques for detecting intellectual property (IP) piracy, detecting hardware Trojans (HTs), and reverse engineering circuits, to name a few. These techniques have demonstrated outstanding accuracy and have received much attention in the community. However, since these techniques are used for security applications, it is imperative to evaluate them thoroughly and ensure they are robust and do not compromise the security of integrated circuits. In this work, we propose AttackGNN, the first red-team attack on GNN-based techniques in hardware security. To this end, we devise a novel reinforcement learning (RL) agent that generates adversarial examples, i.e., circuits, against the GNN-based techniques. We overcome three challenges related to effectiveness, scalability, and generality to devise a potent RL agent. We target five GNN-based techniques for four crucial classes of problems in hardware security: IP piracy detection, HT detection/localization, reverse engineering, and hardware obfuscation. Through our approach, we craft circuits that fool all GNNs considered in this work. For instance, to evade IP piracy detection, we generate adversarial pirated circuits that the GNN-based defense classifies as not pirated. To attack the HT-localization GNN, we generate HT-infested circuits that fool the defense on all tested circuits. We obtain a similar 100% success rate against the GNNs for all classes of problems.
