Statistical methods have been widely misused and misinterpreted in various scientific fields, raising significant concerns about the integrity of scientific research. To mitigate this problem, we propose a new method for formally specifying and automatically verifying the correctness of statistical programs. In this method, programmers are required to annotate the source code of the statistical programs with the requirements for these methods. Through this annotation, they are reminded to check the requirements for statistical methods, including those that cannot be formally verified, such as the distribution of the unknown true population. Our software tool StatWhy automatically checks whether programmers have properly specified the requirements for the statistical methods, thereby identifying any missing requirements that need to be addressed. This tool is implemented using the Why3 platform to verify the correctness of OCaml programs that conduct statistical hypothesis testing. We demonstrate how StatWhy can be used to avoid common errors in various popular statistical hypothesis testing programs.
翻译:统计方法在多个科学领域中被广泛误用和曲解,引发了关于科学研究完整性的重大关切。为缓解此问题,我们提出一种新方法,用于形式化地规约并自动验证统计程序的正确性。该方法要求程序员在统计程序的源代码中标注这些方法所需的前提条件。通过此类标注,程序员被提醒需要检查统计方法的前提条件,包括那些无法被形式化验证的条件(例如未知真实总体的分布)。我们的软件工具StatWhy可自动检查程序员是否正确规约了统计方法的前提条件,从而识别出需要补充的缺失条件。该工具基于Why3平台实现,用于验证执行统计假设检验的OCaml程序的正确性。我们通过多个常用统计假设检验程序的案例,展示了StatWhy如何帮助避免常见错误。