Robust Standard Errors for Bayesian Posterior Functionals via the Infinitesimal Jackknife

Quantitative research in the social and behavioral sciences relies heavily on nonlinear posterior functionals such as indirect effects, standardized coefficients, effect sizes, intraclass correlations, and multilevel variance-explained measures. The posterior standard deviation (PostSD) is the default uncertainty summary for these quantities, yet it presupposes a correctly specified model. When the working model is wrong, as is common with behavioral data that exhibit heavy tails and heteroskedasticity, PostSD can severely underestimate the frequentist standard error. The nonparametric bootstrap offers robustness but requires repeated MCMC refits, while the delta method demands a separate analytic gradient derivation for every new functional. The infinitesimal jackknife standard error (IJSE; Giordano & Broderick, 2023) sidesteps both limitations: it approximates the bootstrap variance through influence functions computed from a single MCMC run, applies to any posterior functional without modification, and requires no analytic derivatives. We discuss the use of the IJSE methodology at both the observation level and the cluster level and evaluate it through four simulation studies covering six functionals from mediation analysis, ANOVA, and multilevel modeling, all of which are commonly used in the social and behavioral sciences. Under misspecification, PostSD substantially underestimated the true standard error across all settings, whereas IJSE closely tracked the bootstrap at a fraction of the computational cost. Under correct specification, all three methods agreed, confirming that IJSE introduces no distortion when the model is right. These results support IJSE as a practical, general-purpose tool for robust uncertainty quantification in Bayesian workflows throughout the social and behavioral sciences.
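
To make the single-run computation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the functional is the posterior mean of a scalar g(θ), takes the influence of observation n to be the posterior covariance between g(θ) and that observation's log-likelihood, and forms the squared standard error as the sum of squared centered influences. The function name `ij_se`, the optional `cluster_ids` argument (which sums log-likelihoods within clusters, an assumed reading of the cluster-level variant), and the toy normal-mean model are all illustrative choices.

```python
import numpy as np


def ij_se(g_draws, loglik_draws, cluster_ids=None):
    """Sketch of an infinitesimal jackknife SE for the posterior mean of g(theta).

    g_draws      : (S,) array, g(theta_s) evaluated at each of S posterior draws
    loglik_draws : (S, N) array, log-likelihood of observation n at draw s
    cluster_ids  : optional (N,) labels; if given, log-likelihoods are summed
                   within each cluster before computing influences (assumption)
    """
    g = np.asarray(g_draws, dtype=float)
    ll = np.asarray(loglik_draws, dtype=float)
    if cluster_ids is not None:
        _, idx = np.unique(np.asarray(cluster_ids), return_inverse=True)
        onehot = np.eye(idx.max() + 1)[idx]      # (N, J) cluster indicator matrix
        ll = ll @ onehot                         # (S, J) cluster log-likelihoods
    n_draws = g.shape[0]
    g_centered = g - g.mean()
    ll_centered = ll - ll.mean(axis=0)
    # Posterior covariance of g with each unit's log-likelihood (the influences).
    infl = g_centered @ ll_centered / (n_draws - 1)
    return float(np.sqrt(np.sum((infl - infl.mean()) ** 2)))


# Toy misspecified model (hypothetical): the working model is y_i ~ N(mu, 1)
# with a flat prior, but the data are heavy-tailed with variance far above 1.
rng = np.random.default_rng(0)
N, S = 200, 4000
y = 3.0 * rng.standard_t(df=5, size=N)
mu = rng.normal(y.mean(), 1.0 / np.sqrt(N), size=S)   # exact posterior draws
loglik = -0.5 * (y[None, :] - mu[:, None]) ** 2       # (S, N), constants dropped

post_sd = mu.std(ddof=1)     # model-based uncertainty, about 1 / sqrt(N)
se_ij = ij_se(mu, loglik)    # robust estimate from the same draws
```

In this toy setting the posterior standard deviation is tied to the assumed unit variance, while the sketch recovers approximately the usual sample-based standard error of the mean, with no refitting. For a functional such as an indirect effect, `g_draws` would hold the per-draw values of the functional taken from the same MCMC output.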
