In the realm of formal theorem proving, the Coq proof assistant stands out for its rigorous approach to verifying mathematical assertions and software correctness. Despite the advances in artificial intelligence and machine learning, the specialized nature of Coq syntax and semantics poses unique challenges for Large Language Models (LLMs). Addressing this gap, we present a comprehensive dataset specifically designed to enhance LLMs' proficiency in interpreting and generating Coq code. This dataset, derived from a collection of over 10,000 Coq source files, encompasses a wide array of propositions, proofs, and definitions, enriched with metadata including source references and licensing information. Our primary aim is to facilitate the development of LLMs capable of generating syntactically correct and semantically meaningful Coq constructs, thereby advancing the frontier of automated theorem proving. Initial experiments with this dataset have showcased its significant potential; models trained on this data exhibited enhanced accuracy in Coq code generation. Notably, a particular experiment revealed that a fine-tuned LLM was capable of generating 141 valid proofs for a basic lemma, highlighting the dataset's utility in facilitating the discovery of diverse and valid proof strategies. This paper discusses the dataset's composition, the methodology behind its creation, and the implications of our findings for the future of machine learning in formal verification. The dataset is accessible for further research and exploration: https://huggingface.co/datasets/florath/coq-facts-props-proofs-gen0-v1
翻译:在形式化定理证明领域,Coq证明助手因其对数学断言和软件正确性的严格验证方法而独树一帜。尽管人工智能与机器学习已取得显著进展,但Coq语法与语义的专业性对大型语言模型(LLMs)构成了独特挑战。为弥补这一不足,我们提出一个专门用于提升LLMs在Coq代码解释与生成方面能力的综合数据集。该数据集源自超过10,000个Coq源文件,涵盖广泛的命题、证明与定义,并附有来源引用及许可证信息等元数据。我们的主要目标是通过该数据集推动LLMs生成语法正确且语义有效的Coq结构,从而拓展自动化定理证明的前沿。初步实验表明,该数据集具有显著潜力;基于该数据集训练的模型在Coq代码生成准确率上表现更优。值得注意的是,一项特定实验发现,经过微调的LLM能够为基础引理生成141个有效证明,这凸显了该数据集在促进多样化有效证明策略探索方面的价值。本文讨论了数据集的构成、创建方法及其发现对形式验证领域机器学习发展的启示。该数据集可公开获取以供进一步研究与探索:https://huggingface.co/datasets/florath/coq-facts-props-proofs-gen0-v1