Verifying that a compiled binary originates from its claimed source code, a property known as source code provenance, is a fundamental security requirement. Achieving verifiable provenance in practice remains challenging. The most popular technique, reproducible builds, requires the verifier to recreate the exact build toolchain and environment and then re-execute the build, which is difficult in practice. We propose a novel approach to verifiable provenance based on compiling software inside zero-knowledge virtual machines (zkVMs). By executing a compiler within a zkVM, our system produces both the compiled output and a cryptographic proof attesting that the compilation was performed on the claimed source code with the claimed compiler. We build a proof-of-concept using the RISC Zero zkVM and the ChibiCC C compiler, and evaluate it on 200 synthetic programs as well as 31 OpenSSL and 21 libsodium source files. Our results show that zk-compilation is applicable to real-world software and provides strong security guarantees: all adversarial tests targeting compiler substitution, source tampering, output manipulation, and replay attacks are successfully blocked.
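To make the guarantee concrete, the following is a minimal sketch of the checks a verifier might run on the public claims attached to such a proof. It is an illustration under stated assumptions, not the system described above: the `Journal` layout, the function names, and the use of a verifier-chosen nonce to detect replay are all hypothetical, and the cryptographic verification of the proof itself (which a zkVM would supply) is not modeled here.

```python
import hashlib
from dataclasses import dataclass
from typing import List


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Journal:
    """Public claims a compilation proof could commit to (hypothetical layout)."""
    compiler_id: str  # digest of the compiler program run inside the zkVM
    source_hash: str  # digest of the source code that was compiled
    output_hash: str  # digest of the compiled output
    nonce: str        # verifier-chosen value (assumed replay defence)


def make_journal(compiler: bytes, source: bytes, output: bytes, nonce: str) -> Journal:
    """Stand-in for the prover: record digests of what was actually used."""
    return Journal(sha256_hex(compiler), sha256_hex(source), sha256_hex(output), nonce)


def check_claims(journal: Journal, expected_compiler_id: str,
                 source: bytes, output: bytes, nonce: str) -> List[str]:
    """Return the names of the checks that fail; an empty list means consistent.

    A real verifier would first verify the proof cryptographically;
    this sketch only compares the committed digests.
    """
    failures = []
    if journal.compiler_id != expected_compiler_id:
        failures.append("compiler substitution")
    if journal.source_hash != sha256_hex(source):
        failures.append("source tampering")
    if journal.output_hash != sha256_hex(output):
        failures.append("output manipulation")
    if journal.nonce != nonce:
        failures.append("replay")
    return failures
```

In this sketch each of the four attack classes named above corresponds to one mismatched field, so a verifier that trusts the proof only needs the expected compiler digest, the source, the binary, and its own nonce to accept or reject a build.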