Proof-Carrying Certificates for LLM Pipelines: A Trust-Boundary Architecture

We present a framework for verifying the deterministic structured computations surrounding a large language model rather than the model itself, extending a Lean 4 trust-boundary architecture to the generic interfaces of modern LLM pipelines. Certificate validity is a Lean 4 kernel type-check plus a sorry-free transitive axiom audit against the trusted set {propext, Classical.choice, Quot.sound}; other assumptions are declared and partitioned by tier (mathematical placeholders, cryptographic assumptions, ML/human oracles). The technical contribution comprises three local certificate families and two operators. The families are conflict-aware bilattice grounding (with an emission-gate soundness lemma), embedding sensitivity and paraphrase stability, and Hoare-style agent action. The operators are a Maximal Certifiable Residue, which turns abstention into the maximum-weight certifiable residue with audit-logged dropped claims, and a Compositional Stability theorem, which yields a closed-form pipeline-wide perturbation budget from per-layer gains and margins. The three families plus a Universal Assurance Card consolidator form the per-call deliverable for high-stakes deployments: patent and legal retrieval, regulated finance, clinical decision support, and agentic systems with irreversible side effects. A compiled Lean 4 reference artifact (Lean v4.30.0-rc2, Mathlib) covers all 22 certificate types, with 17 of 46 kernel-audited declarations axiom-free, the rest depending only on the trusted set and declared assumptions, and zero uses of sorryAx or Lean.ofReduceBool. The three families are empirically tested through four registered pilots: bilattice grounding on adversarially perturbed HotpotQA, embedding sensitivity in short- and long-form settings, and Hoare-style agent action on a filesystem sandbox with adversarial prompt injection.

翻译：我们提出了一种验证框架，旨在验证围绕大语言模型的结构化确定性计算过程（而非模型本身），将 Lean 4 信任边界架构扩展至现代大语言模型流程的通用接口。证书有效性通过 Lean 4 核心类型检查以及基于可信集合 {propext, Classical.choice, Quot.sound} 的无sorry（译者注：Lean中的未填充证明占位符）传递性公理审计来保证；其他假设按层级分类声明（数学占位符、密码学假设、机器学习/人类预言机）。技术贡献包括三个局部证书族和两个运算符。证书族为：冲突感知双格基化（附带发射门健全性引理）、嵌入敏感性与释义稳定性、以及霍尔风格智能体行动。运算符包括：最大可证明正确残差，它通过审计日志记录被舍弃声明将放弃转化为最大权重可证明正确残差；以及组合稳定性定理，它基于每层增益与余量导出闭式管道级扰动预算。这三个证书族与全能保证卡整合器共同构成高风险场景（专利与法律检索、受监管金融、临床决策支持、以及存在不可逆副作用的智能体系统）的每次调用交付物。一个经编译的 Lean 4 参考制品（Lean v4.30.0-rc2, Mathlib）覆盖全部22种证书类型，其中46个核审计声明中的17个无需公理，其余仅依赖可信集合与已声明假设，且未使用任何 sorryAx 或 Lean.ofReduceBool。通过四项注册试点对三个证书族进行实证测试：基于对抗扰动版 HotpotQA 的双格基化、短文本与长文本场景下的嵌入敏感性、以及通过文件系统沙箱及对抗性提示注入对霍尔风格智能体行动的验证。