Verifiable Fine-Tuning for LLMs: Zero-Knowledge Training Proofs Bound to Data Provenance and Policy

Large language models are often adapted through parameter-efficient fine-tuning, but current release practices provide weak assurances about what data were used and how updates were computed. We present Verifiable Fine-Tuning, a protocol and system that produces succinct zero-knowledge proofs that a released model was obtained from a public initialization under a declared training program and an auditable dataset commitment. The approach combines five elements: (1) commitments that bind data sources, preprocessing, licenses, and per-epoch quota counters to a manifest; (2) a verifiable sampler that supports both public, replayable batch selection and private, index-hiding batch selection; (3) update circuits restricted to parameter-efficient fine-tuning that enforce AdamW-style optimizer semantics and use proof-friendly approximations with explicit error budgets; (4) recursive aggregation that folds per-step proofs into per-epoch and end-to-end certificates with millisecond verification; and (5) provenance binding and optional trusted-execution property cards that attest code identity and constants. On English and bilingual instruction mixtures, the method maintains utility within tight budgets while achieving practical proof performance. Policy quotas are enforced with zero violations, and private sampling windows show no measurable index leakage. Federated experiments demonstrate that the system composes with probabilistic audits and bandwidth constraints. These results indicate that end-to-end verifiable fine-tuning is feasible today for real parameter-efficient pipelines, closing a critical trust gap for regulated and decentralized deployments.
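To make the first two elements concrete, the following is a minimal sketch of a manifest commitment, a public replayable batch sampler, and a per-epoch quota check. It is an illustration under stated assumptions, not the system's implementation: the manifest schema (`sources`, `num_examples`, `epoch_quota`, `license`), the function names, and the use of plain SHA-256 are all hypothetical, and nothing here is zero-knowledge or index-hiding. It only shows how batch selection can be bound to a dataset commitment so that any auditor can replay it.

```python
from __future__ import annotations

import hashlib
import json


def commit_manifest(manifest: dict) -> str:
    """Hash commitment over a canonical JSON encoding of the manifest.

    Assumption: a plain SHA-256 digest stands in for whatever commitment
    scheme a real system would use.
    """
    encoded = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def sample_batch(commitment: str, seed: bytes, step: int,
                 n_examples: int, batch_size: int) -> list[int]:
    """Derive distinct batch indices deterministically from public inputs.

    Anyone holding (commitment, seed, step) can replay the same batch.
    The modulo reduction has a small bias, which this sketch ignores.
    """
    if batch_size > n_examples:
        raise ValueError("batch_size cannot exceed n_examples")
    indices: list[int] = []
    counter = 0
    while len(indices) < batch_size:
        msg = (commitment.encode() + seed
               + step.to_bytes(8, "big") + counter.to_bytes(8, "big"))
        digest = hashlib.sha256(msg).digest()
        idx = int.from_bytes(digest[:8], "big") % n_examples
        if idx not in indices:  # reject repeats within a batch
            indices.append(idx)
        counter += 1
    return indices


def source_of(manifest: dict, idx: int) -> str:
    """Map a global example index to its source via cumulative ranges."""
    offset = 0
    for src in manifest["sources"]:
        if idx < offset + src["num_examples"]:
            return src["name"]
        offset += src["num_examples"]
    raise IndexError(idx)


def quota_respected(manifest: dict, batches: list[list[int]]) -> bool:
    """Check that no source exceeds its per-epoch quota over the batches."""
    used = {src["name"]: 0 for src in manifest["sources"]}
    for batch in batches:
        for idx in batch:
            used[source_of(manifest, idx)] += 1
    return all(used[s["name"]] <= s["epoch_quota"] for s in manifest["sources"])


# Hypothetical two-source manifest used only for illustration.
manifest = {
    "sources": [
        {"name": "A", "num_examples": 100, "epoch_quota": 50, "license": "CC-BY-4.0"},
        {"name": "B", "num_examples": 50, "epoch_quota": 50, "license": "MIT"},
    ]
}
commitment = commit_manifest(manifest)
batch = sample_batch(commitment, b"public-seed", step=0, n_examples=150, batch_size=4)
replayed = sample_batch(commitment, b"public-seed", step=0, n_examples=150, batch_size=4)
```

Because the commitment is an input to the sampler, changing any manifest field (for example, a license string) changes the commitment and therefore the sampled batches, which is the sense in which sampling is bound to the declared data. A private, index-hiding variant would need additional cryptographic machinery that this sketch does not attempt.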
