Incorrect handling of security-critical data formats, particularly in low-level languages, are the root cause of many security vulnerabilities. Provably correct parsing and serialization tools that target languages like C can help. Towards this end, we present PulseParse, a library of verified parser and serializer combinators for non-malleable binary formats. Specifications and proofs in PulseParse are in separation logic, offering a more abstract and compositional interface, with full support for data validation, parsing, and serialization. PulseParse also supports a class of recursive formats -- with a focus on security and handling adversarial inputs, we show how to parse such formats with only a constant amount of stack space. We use PulseParse at scale by providing the first formalization of CBOR, a recursive, binary data format standard, with growing adoption in various industrial standards. We prove that the deterministic fragment of CBOR is non-malleable and provide EverCBOR, a verified library in both C and Rust to validate, parse, and serialize CBOR objects implemented using PulseParse. Next, we provide the first formalization of CDDL, a schema definition language for CBOR. We identify well-formedness conditions on CDDL definitions that ensure that they yield unambiguous, non-malleable formats, and implement EverCDDL, a tool that checks that a CDDL definition is well-formed, and then produces verified parsers and serializers for it. To evaluate our work, we use EverCDDL to generate verified parsers and serializers for various security-critical applications. Notably, we build a formally verified implementation of COSE signing, a standard for cryptographically signed objects. We also use our toolchain to generate verified code for other standards specified in CDDL, including DICE Protection Environment, a secure boot protocol standard.
翻译:对安全关键数据格式的错误处理(尤其在低级语言中)是许多安全漏洞的根本原因。面向C等语言的可证明正确的解析与序列化工具能够提供帮助。为此,我们提出了PulseParse——一个针对不可篡改二进制格式的经过验证的解析器与序列化器组合子库。PulseParse中的规范与证明均采用分离逻辑,提供了更抽象、更具组合性的接口,并完整支持数据验证、解析与序列化功能。PulseParse还支持一类递归格式——着眼于安全性与对抗性输入处理,我们展示了如何仅使用常量栈空间解析此类格式。我们通过首次形式化CBOR(一种日益广泛应用于各类工业标准的递归二进制数据格式标准)来大规模应用PulseParse。我们证明了CBOR确定性片段具有不可篡改性,并开发了EverCBOR——一个基于PulseParse实现的、同时支持C与Rust的验证库,用于验证、解析和序列化CBOR对象。进一步,我们首次形式化了CDDL(一种CBOR的模式定义语言),确定了CDDL定义的良好性条件以确保其生成无歧义且不可篡改的格式,并实现了EverCDDL工具。该工具可检查CDDL定义的良好性,并据此生成经过验证的解析器与序列化器。为评估工作成果,我们使用EverCDDL为多个安全关键应用生成验证过的解析器与序列化器。特别地,我们构建了COSE签名(密码学签名对象标准)的形式化验证实现,并利用工具链为其他CDDL规范标准生成验证代码,包括安全启动协议标准DICE保护环境。