Instrumental variable (IV) strategies are widely used in political science to establish causal relationships. However, the identifying assumptions required by an IV design are demanding, and it remains challenging for researchers to assess their validity. In this paper, we replicate 67 papers published in three top political science journals between 2010 and 2022 and identify several troubling patterns. First, researchers often overestimate the strength of their IVs because the errors are not i.i.d., for example when they have a clustering structure. Second, the most commonly used t-test for two-stage least squares (2SLS) estimates often severely underestimates uncertainty. Using more robust inferential methods, we find that around 19-30% of the 2SLS estimates in our sample are underpowered. Third, in the majority of the replicated studies, the 2SLS estimates are much larger than the ordinary least squares (OLS) estimates. In studies where the IVs are not experimentally generated, the ratio of the 2SLS to the OLS estimate is negatively correlated with the strength of the IVs, suggesting potential violations of unconfoundedness or the exclusion restriction. To help researchers avoid these pitfalls, we provide a checklist for better practice.
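The first pattern can be illustrated with a minimal sketch in Python (NumPy). It is not the procedure used in the replications: the data are simulated, the function name `first_stage_f` and all parameter values are hypothetical, and the basic CR0 cluster-robust sandwich estimator is assumed here only as a simple stand-in for a robust alternative. With a single instrument, the first-stage F statistic is the squared t-statistic on the instrument, so the sketch computes it under a classical (i.i.d.) variance and under a cluster-robust variance.

```python
import numpy as np


def first_stage_f(z, d, clusters):
    """Return (classical F, cluster-robust F) for a single instrument z.

    Hypothetical helper for illustration: regresses the treatment d on an
    intercept and z, then forms F = t^2 on z under two variance estimators.
    """
    n = len(d)
    X = np.column_stack([np.ones(n), z])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ d
    resid = d - X @ beta

    # Classical variance: assumes i.i.d. errors.
    sigma2 = resid @ resid / (n - X.shape[1])
    v_classical = sigma2 * XtX_inv

    # Cluster-robust (CR0) sandwich: sum score outer products by cluster.
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        idx = clusters == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)
    v_cluster = XtX_inv @ meat @ XtX_inv

    f_classical = beta[1] ** 2 / v_classical[1, 1]
    f_cluster = beta[1] ** 2 / v_cluster[1, 1]
    return f_classical, f_cluster


# Simulated data (assumed values): 50 clusters of 40 units each.
rng = np.random.default_rng(0)
G, m = 50, 40
clusters = np.repeat(np.arange(G), m)

# The instrument varies only at the cluster level, and the first-stage
# error has a shared cluster component, so the errors are not i.i.d.
z = np.repeat(rng.normal(size=G), m)
u = np.repeat(rng.normal(size=G), m) + rng.normal(size=G * m)
d = 0.3 * z + u

f_classical, f_cluster = first_stage_f(z, d, clusters)
print(f"classical F: {f_classical:.1f}, cluster-robust F: {f_cluster:.1f}")
```

In this simulated setting the classical F statistic is much larger than the cluster-robust one, because the i.i.d. variance formula treats the 40 observations within each cluster as independent pieces of information about the instrument.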