The Hardware Trojan (HT) problem can be thought of as a continuous game between attackers and defenders, each striving to outsmart the other by leveraging any available means for an advantage. Machine Learning (ML) has recently played a key role in advancing HT research. Various novel techniques, such as Reinforcement Learning (RL) and Graph Neural Networks (GNNs), have shown HT insertion and detection capabilities. ML-based HT insertion, specifically, has seen a spike in research activity due to the shortcomings of conventional HT benchmarks and the inherent human design bias introduced when they are created by hand. This work continues this line of research by presenting TrojanForge, a tool that generates adversarial HT examples that defeat HT detectors, demonstrating the capabilities of GAN-like adversarial tools for automatic HT insertion. We introduce an RL environment in which the insertion agent interacts with HT detectors in an insertion-detection loop and collects rewards based on its success in bypassing them. Our results show that this process helps inserted HTs evade various HT detectors, achieving high attack success rates. The tool also provides insight into why HT insertion fails in some instances and how this knowledge can be leveraged for defense.
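The insertion-detection loop described above can be illustrated with a toy sketch. This is not the TrojanForge implementation: a simple epsilon-greedy bandit stands in for the RL agent, the two threshold "detectors" are placeholders for real HT detectors, and every name here (`SITES`, `DETECTORS`, `evasion_reward`, `train_agent`) along with the two-number feature summary is a hypothetical chosen for illustration. The reward is assumed to be the fraction of detectors that a candidate insertion evades.

```python
import random
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical summary of a candidate insertion:
# (switching activity of the trigger net, number of added gates).
Features = Tuple[float, float]
# A detector returns True when it flags the modified circuit as infected.
Detector = Callable[[Features], bool]

# Toy stand-ins for real HT detectors (assumed thresholds, not from the paper).
DETECTORS: Sequence[Detector] = (
    lambda f: f[0] > 0.3,  # flags triggers on frequently switching nets
    lambda f: f[1] > 5,    # flags large payloads
)

# Hypothetical insertion choices available to the agent.
SITES: Dict[str, Features] = {
    "rare_net": (0.05, 3),
    "busy_net": (0.6, 3),
    "large_payload": (0.6, 12),
}


def evasion_reward(features: Features, detectors: Sequence[Detector]) -> float:
    """Fraction of detectors that fail to flag the candidate insertion."""
    flagged = sum(1 for detect in detectors if detect(features))
    return 1.0 - flagged / len(detectors)


def train_agent(
    sites: Dict[str, Features],
    detectors: Sequence[Detector],
    episodes: int = 200,
    epsilon: float = 0.1,
    seed: int = 0,
) -> Dict[str, float]:
    """Epsilon-greedy bandit: insert, query the detectors, update value estimates."""
    rng = random.Random(seed)
    names = sorted(sites)
    q = {name: 0.0 for name in names}    # estimated reward per insertion choice
    counts = {name: 0 for name in names}
    for step in range(episodes):
        if step < len(names):
            action = names[step]                      # try every choice once
        elif rng.random() < epsilon:
            action = rng.choice(names)                # explore
        else:
            action = max(names, key=lambda n: q[n])   # exploit best estimate
        reward = evasion_reward(sites[action], detectors)
        counts[action] += 1
        q[action] += (reward - q[action]) / counts[action]  # running mean
    return q


q_values = train_agent(SITES, DETECTORS)
print(max(q_values, key=q_values.get))  # prints "rare_net"
```

In this toy setting the agent settles on the insertion that evades both detectors. The low value estimates for the other choices record which detector-visible properties caused them to be caught, which is the kind of signal the abstract suggests can inform defense.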