Misquoted No More: Securely Extracting F* Programs with IO

Shallow embeddings that use monads to represent effects are popular in proof-oriented languages because they are convenient for formal verification. Once shallowly embedded programs are verified, they are often extracted to mainstream languages like OCaml or C and linked into larger codebases. The extraction process is not fully verified because it often involves quotation -- turning the shallowly embedded program into a deeply embedded one -- and verifying quotation remains a major open challenge. Instead, some prior work obtains formal correctness guarantees using translation validation to certify individual extraction results. We build on this idea, but limit the use of translation validation to a first extraction step that we call relational quotation and that uses a metaprogram to construct a typing derivation for the given shallowly embedded program. This metaprogram is simple, since the typing derivation follows the structure of the original program. Once we validate, syntactically, that the typing derivation is valid for the original program, we pass it to a verified syntax-generation function that produces code guaranteed to be semantically related to the original program. We apply this general idea to build SEIO*, a framework for extracting shallowly embedded F* programs with IO to a deeply embedded lambda-calculus while providing formal secure compilation guarantees. Using two cross-language logical relations, we devise a machine-checked proof in F* that SEIO* guarantees Robust Relational Hyperproperty Preservation (RrHP), a very strong secure compilation criterion that implies full abstraction as well as preservation of trace properties and hyperproperties against arbitrary adversarial contexts. This goes beyond the state of the art in verified and certifying extraction, which so far has focused on correctness rather than security.

翻译：浅层嵌入使用单子表示效应，因其便于形式化验证而在面向证明的语言中广受欢迎。经过验证的浅层嵌入程序常被提取至OCaml或C等主流语言，并集成到更大的代码库中。但由于提取过程涉及引用（将浅层嵌入程序转换为深层嵌入形式），而引用的正确性验证仍是重大开放挑战，因此提取过程尚未实现完全验证。现有部分工作转而采用翻译验证方法，通过认证单个提取结果来获得形式化正确性保证。我们在该思想基础上进行改进，将翻译验证的适用范围限制在首个提取步骤（称为关系引用），该步骤使用元程序为给定的浅层嵌入程序构建类型推导。由于类型推导遵循原始程序的结构，该元程序的设计十分简洁。在语法层面验证类型推导对原始程序的有效性后，我们将该推导传递给经验证的语法生成函数，该函数能够产生与原始程序语义相关的代码。我们将这一通用方法应用于SEIO*框架的构建——该框架可在提供形式化安全编译保证的前提下，将含IO操作的浅层嵌入F*程序提取至深层嵌入lambda演算。通过构建两种跨语言逻辑关系，我们在F*中完成了机器验证证明：SEIO*能够保证鲁棒关系超属性保持（RrHP）。这一极强的安全编译准则既蕴含完全抽象性，又能在任意恶意上下文环境中保持迹属性和超属性。相较于当前聚焦于正确性而非安全性的认证提取与验证提取技术，本工作实现了重要突破。