Learning to Localize Leakage of Cryptographic Sensitive Variables

From arXiv. Accepted to TMLR (Transactions on Machine Learning Research), 2026. Camera-ready version. 65 pages, 21 figures. Code available at https://github.com/jimgammell/learning_to_localize_leakage

While cryptographic algorithms such as the ubiquitous Advanced Encryption Standard (AES) are secure, *physical implementations* of these algorithms in hardware inevitably 'leak' sensitive data such as cryptographic keys. A particularly insidious form of leakage arises from the fact that hardware consumes power and emits radiation in a manner that is statistically associated with the data it processes and the instructions it executes. Supervised deep learning has emerged as a state-of-the-art tool for carrying out *side-channel attacks*, which exploit this leakage by learning to map power/radiation measurements recorded during encryption to the sensitive data operated on during that encryption. In this work we develop a principled deep learning framework for determining the relative leakage due to measurements recorded at different points in time, in order to inform *defense* against such attacks. This information is invaluable to cryptographic hardware designers for understanding *why* their hardware leaks and how they can mitigate it (e.g., by indicating the particular sections of code or electronic components which are responsible). Our framework is based on an adversarial game between a classifier trained to estimate the conditional distributions of sensitive data given subsets of measurements, and a budget-constrained noise distribution which probabilistically erases individual measurements to maximize the loss of this classifier. We demonstrate our method's efficacy and ability to overcome limitations of prior work through extensive experimental comparison on 6 publicly available power/EM trace datasets from AES, ECC and RSA implementations. Our PyTorch code is available at https://github.com/jimgammell/learning_to_localize_leakage.
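To make the adversarial game described in the abstract concrete, the following is a minimal toy sketch and not the authors' implementation. Everything in it is an assumption made for illustration: a secret bit, three binary "measurements" with hypothetical agreement probabilities `AGREE`, a Bayes-optimal classifier in place of a trained neural network, a budget expressed as a fixed sum of erasure probabilities, and a grid search in place of gradient-based training. The paper's actual objective, budget parameterization, and optimization procedure may differ.

```python
import itertools
import math

# Hypothetical toy setup (not from the paper): a secret bit Y and three
# binary "measurements"; measurement t equals Y with probability AGREE[t].
AGREE = (0.9, 0.6, 0.5)  # strong leak, weak leak, no leak
T = len(AGREE)


def conditional_entropy(kept):
    """H(Y | X_kept) in bits: the cross-entropy of a Bayes-optimal classifier
    that sees only the measurements whose indices are in `kept`."""
    total = 0.0
    for xs in itertools.product((0, 1), repeat=len(kept)):
        # Joint probability P(Y = y, X_kept = xs) for y = 0 and y = 1.
        joint = []
        for y in (0, 1):
            p = 0.5  # uniform prior on the secret bit
            for t, x in zip(kept, xs):
                p *= AGREE[t] if x == y else 1.0 - AGREE[t]
            joint.append(p)
        marginal = sum(joint)
        for p in joint:
            if p > 0.0:
                total -= p * math.log2(p / marginal)
    return total


def expected_loss(gamma):
    """Classifier loss averaged over random erasure masks, where measurement t
    is erased independently with probability gamma[t]."""
    loss = 0.0
    for mask in itertools.product((0, 1), repeat=T):  # 1 means erased
        prob = 1.0
        for t in range(T):
            prob *= gamma[t] if mask[t] else 1.0 - gamma[t]
        kept = tuple(t for t in range(T) if not mask[t])
        loss += prob * conditional_entropy(kept)
    return loss


def best_erasure(budget_tenths=10):
    """Grid-search erasure probabilities (multiples of 0.1) whose sum equals
    the budget, returning the setting that maximizes the classifier's loss."""
    candidates = [
        tuple(k / 10 for k in ks)
        for ks in itertools.product(range(11), repeat=T)
        if sum(ks) == budget_tenths
    ]
    return max(candidates, key=expected_loss)


gamma_star = best_erasure()
print(gamma_star)
```

Under these toy assumptions the search spends the entire budget on the strongly leaking measurement, which is the intuition behind reading the learned erasure probabilities as a per-timestep leakage map: the adversary erases a measurement only when doing so hurts the classifier most.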
