Compile-Time Tensor Shape Checking via Staged Shape-Dependent Types

When writing programs involving matrices or tensors in general, it is desirable to rule out the inconsistency of tensor shapes (i.e., the generalization of matrix sizes) before actual computation. For this purpose, some languages provide dependent types such as Mat m n, and others offer refinement types to track predicates for shapes. Despite the theoretical maturity, however, such methods are often unhandy for continuous software development due to the requirement of proofs for judging type equality or subtyping; even automated proving is often unsuitable due to its unforeseeable time consumption. To remedy this, our study provides an alternative formalization by using staging. Based on the observation that conditions for the shape consistency can be extracted before running the actual tensor computations in many typical cases, we ensure such consistency by assertions evaluated as compile-time computations, not by proofs. Under this formalization, we can verify the consistency virtually statically in the sense that inconsistencies as to tensor shapes will be immediately detected as assertion failures during compile-time computation. Our work achieves a mathematical guarantee that successfully generated code is always consistent with respect to tensor shapes. Furthermore, to vastly lessen the burden of adding shape- or stage-related descriptions, we (1) allow shape-related arguments to be implicit and infer them in a best-effort manner, and (2) offer a non-staged surface language that seemingly resembles ordinary dependently-typed languages and translate its programs into the staged core language. By a prototype implementation, we confirm that our language is expressive enough to verify a number of programs, including several examples offered by ocaml-torch.

翻译：在编写涉及矩阵或一般张量的程序时，希望在真正计算之前排除张量形状（即矩阵大小的推广）的不一致性。为此，某些语言提供了依赖类型（例如 Mat m n），另一些则提供细化类型以跟踪关于形状的谓词。然而，尽管这些方法在理论上已经成熟，但由于需要证明来判定类型相等或子类型关系，它们在持续软件开发中往往不够方便；即便是自动证明，也因其不可预见的时间消耗而常常不适合。为了解决这一问题，我们的研究通过使用阶段化提供了一种替代的形式化方法。基于在许多典型情况下，形状一致性的条件可以在实际张量计算运行之前提取出来的观察，我们通过断言（在编译时计算中进行求值）而非证明来确保这种一致性。在这种形式化下，我们可以在近乎静态的意义上验证一致性，即张量形状的不一致性会在编译时计算过程中立即作为断言失败被检测到。我们的工作实现了数学上的保证：成功生成的代码在张量形状上始终是一致的。此外，为了大幅减轻添加形状或阶段相关描述的负担，我们（1）允许形状相关参数为隐式参数，并以尽力而为的方式进行推断，以及（2）提供一种看似类似普通依赖类型语言的非阶段表面语言，并将其程序翻译到阶段化核心语言中。通过一个原型实现，我们确认我们的语言具有足够的表现力来验证多个程序，包括 ocaml-torch 提供的几个示例。