Computational reproducibility is central to scientific credibility, yet verifying published results at scale remains costly. We develop an AI-assisted workflow for automated full-paper replication -- retrieving materials, reconstructing environments, executing code, and matching outputs to point estimates reported in regression tables. We define a universe of all empirical and quantitative papers from the three top political science journals (2010--2025) and measure stated data availability using automated extraction. For a stratified sample of 384 studies, we apply the workflow to conduct full-paper replication, totaling 3,382 empirical models. We find that journal verification requirements, combined with data archiving mandates, drive reproducibility: the full-paper reproducibility rate rises from 29.6% before DA-RT adoption to 79.8% after, and conditional on accessible replication packages, 94.4% of papers are fully reproducible (237/251). As a secondary application, we apply standardized IV diagnostics to 92 studies (215 specifications), illustrating how automated execution enables systematic reanalysis across heterogeneous empirical settings.
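The output-matching step described above (comparing reproduced estimates against point estimates printed in regression tables) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the matching rule, so the sketch assumes a reproduced value counts as a match when it rounds to the printed value at the printed number of decimals, and it assumes round-half-up rounding. The function names `matches_reported`, `paper_fully_reproduced`, and `rate_percent` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def matches_reported(reported: str, reproduced: float) -> bool:
    """Return True if `reproduced` rounds to the printed value.

    `reported` is kept as a string so that the printed precision
    (e.g. "0.120" has three decimals) is not lost. Assumed rule only.
    """
    target = Decimal(reported)
    # The exponent of the parsed string encodes the printed precision:
    # "0.123" has exponent -3, so the rounding quantum is 0.001.
    quantum = Decimal(1).scaleb(target.as_tuple().exponent)
    rounded = Decimal(repr(reproduced)).quantize(quantum, rounding=ROUND_HALF_UP)
    return rounded == target


def paper_fully_reproduced(pairs: list[tuple[str, float]]) -> bool:
    """A paper counts as fully reproduced only if every model matches.

    An empty list (no models recovered) is treated as not reproduced.
    """
    return bool(pairs) and all(matches_reported(r, x) for r, x in pairs)


def rate_percent(n_reproduced: int, n_total: int) -> float:
    """Share of fully reproduced papers, in percent, to one decimal."""
    return round(100 * n_reproduced / n_total, 1)
```

Applied to the counts reported in the abstract, `rate_percent(237, 251)` gives the stated 94.4% conditional reproducibility rate. A real pipeline would likely need a looser rule in some cases (for example, statistical software that rounds differently, or coefficients reported in rescaled units), which this sketch does not handle.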