Learning with Errors (LWE) is a hard math problem underpinning many proposed post-quantum cryptographic (PQC) systems. The only PQC key encapsulation mechanism (KEM) standardized by NIST is based on module LWE, and current publicly available PQ Homomorphic Encryption (HE) libraries are based on ring LWE. The security of LWE-based PQ cryptosystems is critical, but certain implementation choices could weaken them. One such choice is sparse binary secrets, which are desirable in PQ HE schemes for efficiency reasons. Prior work, SALSA, demonstrated a machine learning-based attack on LWE with sparse binary secrets in small dimensions ($n \le 128$) and at low Hamming weights ($h \le 4$). However, this attack assumes access to millions of eavesdropped LWE samples and fails at higher Hamming weights or dimensions. We present PICANTE, an enhanced machine learning attack on LWE with sparse binary secrets, which recovers secrets in much larger dimensions (up to $n=350$) and with larger Hamming weights (roughly $n/10$, and up to $h=60$ for $n=350$). We achieve this dramatic improvement via a novel preprocessing step, which allows us to generate training data from a linear number of eavesdropped LWE samples ($4n$) and changes the distribution of the data to improve transformer training. We also improve the secret recovery methods of SALSA and introduce a novel cross-attention recovery mechanism that allows us to read off the secret directly from the trained models. While PICANTE does not threaten NIST's proposed LWE standards, it demonstrates significant improvement over SALSA and could scale further, highlighting the need for future investigation into machine learning attacks on LWE with sparse binary secrets.
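To make the setting concrete, the following is a minimal sketch of the object under attack: LWE samples $(\mathbf{a}_i, b_i)$ with $b_i = \langle \mathbf{a}_i, \mathbf{s} \rangle + e_i \bmod q$, where $\mathbf{s}$ is a binary secret of Hamming weight $h$. The dimension $n=350$, weight $h=60$, and sample count $4n$ are taken from the abstract; the modulus `q`, the error width `sigma`, the rounded-Gaussian error, and all function names are illustrative assumptions, not the parameters or code used by PICANTE. The sketch covers only sample generation, not the preprocessing, training, or recovery steps.

```python
import random


def sample_sparse_binary_secret(n, h, rng):
    """Return a length-n 0/1 list with exactly h ones (Hamming weight h)."""
    secret = [0] * n
    for i in rng.sample(range(n), h):
        secret[i] = 1
    return secret


def lwe_samples(secret, m, q, sigma, rng):
    """Return m pairs (a, b) with b = <a, secret> + e mod q.

    The error e is a rounded Gaussian of standard deviation sigma,
    which is an assumed (but common) choice of error distribution.
    """
    n = len(secret)
    samples = []
    for _ in range(m):
        a = [rng.randrange(q) for _ in range(n)]
        e = round(rng.gauss(0.0, sigma))
        b = (sum(ai * si for ai, si in zip(a, secret)) + e) % q
        samples.append((a, b))
    return samples


def centered(x, q):
    """Map x mod q to its representative in (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x


rng = random.Random(0)
n, h = 350, 60          # largest setting quoted in the abstract
q, sigma = 251, 3.0     # hypothetical modulus and error width
secret = sample_sparse_binary_secret(n, h, rng)
samples = lwe_samples(secret, 4 * n, q, sigma, rng)  # 4n eavesdropped samples
```

An attacker sees only `samples`; anyone holding `secret` can subtract $\langle \mathbf{a}_i, \mathbf{s} \rangle$ from each $b_i$ and is left with a small centered residual, which is what separates LWE samples from uniformly random pairs.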