Instrumental variable (IV) strategies are widely used in political science to establish causal relationships. However, the identifying assumptions required by an IV design are demanding, and it remains challenging for researchers to assess their validity. In this paper, we replicate 67 papers published in three top political science journals between 2010 and 2022 and identify several troubling patterns. First, researchers often overestimate the strength of their IVs when errors are non-i.i.d., for example when they have a clustering structure. Second, the most commonly used t-test for two-stage least squares (2SLS) estimates often severely underestimates uncertainty. Using more robust inferential methods, we find that around 19-30% of the 2SLS estimates in our sample are underpowered. Third, in the majority of the replicated studies, the 2SLS estimates are much larger than the ordinary least squares (OLS) estimates. In studies where the IVs are not experimentally generated, the ratio of the two is negatively correlated with the strength of the IVs, suggesting potential violations of unconfoundedness or the exclusion restriction. To help researchers avoid these pitfalls, we provide a checklist for better practice.
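The first pattern, overstated IV strength under clustered errors, can be illustrated with a small simulation. The sketch below is not the paper's replication code: the data are simulated, the helper `first_stage_f` and all parameter values are hypothetical, and the cluster-robust variance uses one common small-sample adjustment. It compares the first-stage F statistic for a single instrument under the classical (i.i.d.) variance and under a cluster-robust variance, in a setting where both the instrument and part of the first-stage error vary only at the cluster level.

```python
import numpy as np


def first_stage_f(d, z, clusters=None):
    """F statistic for one instrument z in an OLS regression of d on [1, z].

    With clusters=None the classical (i.i.d.) variance is used; otherwise a
    cluster-robust sandwich variance with a common small-sample adjustment.
    """
    X = np.column_stack([np.ones_like(z), z])
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ d
    resid = d - X @ beta

    if clusters is None:
        sigma2 = resid @ resid / (n - k)
        V = sigma2 * XtX_inv
    else:
        labels = np.unique(clusters)
        meat = np.zeros((k, k))
        for g in labels:
            idx = clusters == g
            score = X[idx].T @ resid[idx]  # summed score within cluster g
            meat += np.outer(score, score)
        G = len(labels)
        adj = (G / (G - 1)) * ((n - 1) / (n - k))
        V = adj * (XtX_inv @ meat @ XtX_inv)

    # With a single instrument, F equals the squared t statistic.
    return beta[1] ** 2 / V[1, 1]


# Simulated data (all values are illustrative assumptions):
# 50 clusters of 20 units, cluster-level instrument, cluster-level shock.
rng = np.random.default_rng(0)
G, m = 50, 20
clusters = np.repeat(np.arange(G), m)
z = np.repeat(rng.normal(size=G), m)       # instrument varies by cluster
u = np.repeat(rng.normal(size=G), m)       # shared within-cluster error
d = 0.3 * z + u + rng.normal(size=G * m)   # treatment (first stage)

f_classical = first_stage_f(d, z)
f_cluster = first_stage_f(d, z, clusters)
print(f"classical F:      {f_classical:.1f}")
print(f"cluster-robust F: {f_cluster:.1f}")
```

In this design the classical F treats all 1,000 observations as independent, so it should come out well above the cluster-robust F, which effectively relies on 50 independent clusters. An instrument that clears a conventional strength threshold under the classical formula may therefore fall short once the clustering is taken into account.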