Agda is a dependently-typed programming language and a proof assistant, pivotal in proof formalization and programming language theory. This paper extends the Agda ecosystem into machine learning territory, and, vice versa, makes Agda-related resources available to machine learning practitioners. We introduce and release a novel dataset of Agda program-proofs that is elaborate and extensive enough to support various machine learning applications -- the first of its kind. Leveraging the dataset's ultra-high resolution, detailing proof states at the sub-type level, we propose a novel neural architecture targeted at faithfully representing dependently-typed programs on the basis of structural rather than nominal principles. We instantiate and evaluate our architecture in a premise selection setup, where it achieves strong initial results.
翻译:Agda是一种依赖类型的编程语言和证明助手,在证明形式化与编程语言理论中具有核心地位。本文将Agda生态系统拓展至机器学习领域,同时反向推进,使Agda相关资源可供机器学习从业者使用。我们引入并发布了首个经过精心设计且规模足以支撑多种机器学习应用的Agda程序-证明数据集——同类中的首例。利用该数据集超高清的子类型级别证明状态细节,我们提出了一种新型神经架构,旨在基于结构原则而非名称原则忠实表示依赖类型程序。我们在前提选择框架中实例化并评估了该架构,取得了初步强劲成果。