We present a framework in which program analysis -- type checking, bug finding, and equivalence verification -- is organized as computing the Čech cohomology of a semantic presheaf over a program's site category. The presheaf assigns refinement-type information to observation sites and restricts it along data-flow morphisms. The cohomology group $H^{0}$ is the space of globally consistent typings. The first cohomology group $H^{1}$ classifies gluing obstructions -- bugs, type errors, and equivalence failures -- each localized to a specific pair of disagreeing sites. This formulation yields three concrete results unavailable in prior work: (1) the rank of $H^1$ over $F_{2}$ counts the minimum independent fixes; (2) $H_{1}(U, Iso) = 0$ is sound and complete for behavioral equivalence; (3) Mayer-Vietoris enables compositional, incremental obstruction counting. We implement the framework in Deppy, a Python analysis tool, and evaluate it on a suite of 375~benchmarks: 133~bug-detection programs, 134~equivalence pairs, and 108~specification-satisfaction checks. Deppy achieves {100% bug-detection recall} (69% precision, F1 = 81%), 99% equivalence accuracy with zero false equivalences, and 98% spec accuracy with zero false satisfactions -- outperforming mypy and pyright, which report zero findings on unannotated code. The analysis models Python semantics as algebraic geometry: variables live on the generic fiber (non-None) unless on the closed nullable subscheme, integers form Spec($\mathbb{Z}$) with no bounded section (no overflow), and short-circuit evaluation defines an open-set topology on the presheaf.
翻译:我们提出一个框架,将程序分析——类型检查、Bug发现和等价验证——组织为在程序位点范畴上计算语义预层的Čech上同调。该预层将精化类型信息赋予观察位点,并沿数据流态射限制这些信息。上同调群$H^{0}$是全局一致类型构成的空间,而第一上同调群$H^{1}$对黏合障碍——即Bug、类型错误和等价性失效——进行分类,每个障碍均定位于一对存在分歧的位点。这一表述产生了三项先前工作未有的具体成果:(1) 在$F_{2}$上$H^1$的秩可统计最小独立修复数量;(2) $H_{1}(U, Iso) = 0$是行为等价性的可靠且完备判定条件;(3) Mayer-Vietoris序列支持组合式、增量式的障碍计数。我们在Python分析工具Deppy中实现了该框架,并在包含375个基准测试的套件上进行了评估:133个Bug检测程序、134个等价性对以及108个规范满足性检查。Deppy实现了100%的Bug检测召回率(精确率69%,F1值81%)、99%的等价性准确率且零误报等价性、98%的规范准确率且零误报满足性——显著优于对未标注代码报告零发现的mypy和pyright。该分析将Python语义建模为代数几何:变量位于一般纤维(非None)上,除非处于闭的可空子概形;整数构成在$\mathbb{Z}$上的谱(无有界截面,即无溢出);短路求值定义了预层上的开集拓扑。